DOCUMENT RESUME

ED 206 030 



CS 503 584 



TITLE          Speech Research: A Report on the Status and Progress
               of Studies on the Nature of Speech, Instrumentation
               for Its Investigation, and Practical Applications,
               April 1-June 30, 1981. Status Report 66.
INSTITUTION    Haskins Labs., New Haven, Conn.
SPONS AGENCY   National Institutes of Health (DHEW), Bethesda, Md.;
               National Inst. of Education (ED), Washington, D.C.;
               National Science Foundation, Washington, D.C.
PUB DATE       81
CONTRACT       NICHHD-N01-HD-1-2420
GRANT          NICHHD-HD-01994; NIH-RR-05596; NSF-MCS79-16177
NOTE           297p.
EDRS PRICE     MF01/PC12 Plus Postage.
DESCRIPTORS    Acoustic Phonetics; *Articulation (Speech); Auditory
               Perception; Beginning Reading; *Communication
               Research; Comparative Analysis; Memory; Nouns; *Oral
               Language; Orthographic Symbols; Second Languages;
               *Speech Skills; Word Recognition; Writing
               (Composition)



ABSTRACT

Research reports on the nature of speech,
instrumentation for the investigation of speech, and practical
applications of speech research are included in this status report
for the April 1-June 30, 1981, period. The 14 reports deal with the
following topics: (1) electromyography as a technique for laryngeal
investigation, (2) the phonatory mechanism, (3) phonetic perception
of sinusoidal signals, (4) memory for item order and phonetic
recoding in the beginning reader, (5) perceptual equivalence of two
kinds of ambiguous speech stimuli, (6) perceptual targets and
production rules, (7) orthographic variations and visual information
processing, (8) visual word recognition in Serbo-Croatian, (9) word
recognition with mixed-alphabet forms, (10) intralanguage versus
interlanguage Stroop effects in two types of writing systems, (11)
categorical perception of English "r" and "l" sounds by Japanese
bilinguals, (12) the influence of vocalic context on perception of
the "sh"-"s" distinction and two ways of avoiding it, (13) grammatical
priming of inflected nouns, and (14) an evaluation of the "Basic
Orthographic Syllabic Structure" in a phonologically shallow
orthography. (FL)



***********************************************************************
*    Reproductions supplied by EDRS are the best that can be made     *
*                    from the original document.                      *
***********************************************************************



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this document
do not necessarily represent official NIE
position or policy.



SR-66 (1981) 



Status Report on

SPEECH RESEARCH

A Report on
the Status and Progress of Studies on
the Nature of Speech, Instrumentation
for its Investigation, and Practical
Applications

1 April - 30 June 1981



Haskins Laboratories
270 Crown Street
New Haven, Conn. 06510

Distribution of this document is unlimited.

(This document contains no information not freely available to the
general public. Haskins Laboratories distributes it primarily for
library use. Copies are available from the National Technical
Information Service or the ERIC Document Reproduction Service. See
the Appendix for order numbers of previous Status Reports.)









ACKNOWLEDGMENTS

The research reported here was made possible in part by support
from the following sources:

National Institute of Child Health and Human Development
    Grant HD-01994

National Institute of Child Health and Human Development
    Contract N01-HD-1-2420

National Institutes of Health
    Biomedical Research Support Grant RR-05596

National Science Foundation
    Grant MCS79-16177
    Grant RRF8006144

National Institute of Neurological and Communicative
Disorders and Stroke
    Grant NS13870
    Grant NS13617

National Institute of Education
    Grant G-80-0178







HASKINS LABORATORIES 



Personnel in Speech Research

Alvin M. Liberman,* President and Research Director
Franklin S. Cooper,* Associate Research Director
Patrick W. Nye, Associate Research Director
Raymond C. Huey, Treasurer
Alice Dadourian, Secretary



Technical and Support Staff 

Eric L. Andreasson
Elizabeth P. Clark
Vincent Gulisano
Donald Hailey
Terry Halwes
Sabina D. Koroluk
Agnes M. McKeon
Nancy O'Brien
Marilyn K. Parnell
Susan Ross*
William P. Scully
Richard S. Sharkany
Leonard Szubowicz
Edward R. Wiley
David Zeichner



Investigators 

Arthur S. Abramson*
Peter Alfonso*
Thomas Baer
Fredericka Bell-Berti*
Catherine Best*
Gloria J. Borden*
Susan Brady*
Robert Crowder*
Carol A. Fowler*
Louis Goldstein*
Vicki Hanson
Katherine S. Harris*
Alice Healy*
Kiyoshi Honda(1)
Daisy Hung
Leonard Katz*
Scott Kelso
Andrea G. Levitt*
Isabelle Y. Liberman*
Leigh Lisker*
Virginia Mann*
Charles Marshall
Ignatius G. Mattingly*
Nancy S. McGarr*
Lawrence J. Raphael*
Bruno H. Repp
Philip E. Rubin
Donald P. Shankweiler*
Michael Studdert-Kennedy*
Betty Tuller*
Michael T. Turvey*
Ovid Tzeng(2)
Mario Vayra(3)
Robert Verbrugge*



*Part-time
(1)Visiting from University of Tokyo, Japan
(2)Visiting from University of California, Riverside
(3)Visiting from Scuola Normale Superiore, Pisa, Italy



Students*

Claudia Carello
Tova Clayman
David Dechovitz
Steven Eady
Jo Estill
Laurie B. Feldman
Carole E. Gelfer
David Goodman
Janette Henderson
Charles Hoequist
Robert Katz
Aleksandar Kostić
Peter Kugler
Anthony Levas
Harriet Magen
Suzi Pollock
Patti Jo Price
Sandra Prindle
Brad Rakerd
Daniel Recasens
Rosemarie Rotunno
Arnold Shapiro
Susan Smith
Rosemary Szczesiul
Louis G. Tassinary
Douglas Whalen
Deborah Wilkenfeld






CONTENTS

I. Manuscripts and Extended Reports

Electromyography as a technique for laryngeal
investigation - Katherine S. Harris

Investigation of the phonatory mechanism - Thomas Baer

Phonetic perception of sinusoidal signals: Effects of
amplitude variation - Robert E. Remez, Philip E. Rubin,
and Thomas D. Carrell

Memory for item order and phonetic recoding in the
beginning reader - Robert B. Katz, Donald Shankweiler,
and Isabelle Y. Liberman

Perceptual equivalence of two kinds of ambiguous
speech stimuli - Bruno H. Repp

Producing relatively unfamiliar speech gestures:
A synthesis of perceptual targets and production
rules - G. J. Borden, K. S. Harris, Hollis Fitch,
and H. Yoshioka

Orthographic variations and visual information
processing - Daisy L. Hung and Ovid J. L. Tzeng

Visual word recognition in Serbo-Croatian is
necessarily phonological - Laurie Beth Feldman

Word recognition with mixed-alphabet forms -
Laurie Beth Feldman and Aleksandar Kostić

Intra- versus inter-language Stroop effects in
two types of writing systems - Sheng-Ping Fang,
Ovid J. L. Tzeng, and Liz Alva

Categorical perception of English /r/ and /l/ by
Japanese bilinguals - Kristine S. MacKain,
Catherine T. Best, and Winifred Strange

Influence of vocalic context on perception of the
[ʃ]-[s] distinction: V. Two ways of avoiding it -
Bruno H. Repp

Grammatical priming of inflected nouns - G. Lukatela,
A. Kostić, and M. T. Turvey






An evaluation of the "Basic Orthographic Syllabic
Structure" in a phonologically shallow orthography -
Laurie B. Feldman, A. Kostić, G. Lukatela,
and M. T. Turvey

II. Publications

III. Appendix: DTIC and ERIC numbers (SR-21/22 - SR-65)






I. MANUSCRIPTS AND EXTENDED REPORTS

ELECTROMYOGRAPHY AS A TECHNIQUE FOR LARYNGEAL INVESTIGATION*

Katherine S. Harris+




While, as earlier papers at this conference have indicated, the forces
that determine laryngeal adjustment are complex, muscular forces are extremely
important. In recent years, techniques for studying muscle activity in
general have improved, and with these developments, the study of the laryngeal
muscles in normal alert humans has become possible using the techniques of
electromyography. In this paper, I will discuss some properties of muscles,
and of the laryngeal muscles in particular, techniques for EMG recording, and,
finally, some results of studies on the muscular control of the larynx.

MUSCLE PROPERTIES

The building block for a consideration of muscle activity is the motor
unit. This term was coined by Liddell and Sherrington (1925) to include the
motoneuron and the muscle fibers it supplies. The contractile response to one
impulse in one motor neuron is a twitch contraction in the innervated muscle
fibers. Thus, the smallest unit of muscular activity is a contraction of the
muscle fibers of a single motor unit, and the smoothly graded contraction of a
muscle is accomplished by temporal and spatial summation of the activity of a
number of motor units.

The muscles of the body have somewhat different tasks, and their
properties are well correlated with these tasks. For example, some muscles,
such as the muscles of the fingers, must make finely tuned movements, while
others, such as those of the leg, must support the body against the forces of
gravity for long periods of time. These muscles differ in the size of their
motor units, and in the histochemical properties of the individual muscle
fibers, properties that determine their resistance to fatigue.

Table 1 presents some data on motor unit size in the intrinsic laryngeal
muscles, with data on one of the eye muscles and the biceps, for comparison.

*A version of this paper was presented at the Conference on Assessment of
Vocal Pathology, Bethesda, Md., April 1979. (Proceedings to be published in
ASHA Reports.)

+Also Graduate Center, City University of New York.

Acknowledgment. This work was supported by NINCDS Grants NS13870 and
NS13617, and BRSG Grant RR05596.

[HASKINS LABORATORIES: Status Report on Speech Research SR-66 (1981)]



Table 1

Data on the Innervation Ratio of the Intrinsic Laryngeal Muscles with
Some Comparison Information on One of the Eye Muscles and the Biceps

Sources (table rows, in order):
    man (Faaborg-Andersen, 1957)
    man (English & Blevins, 1969)
    cat (English & Blevins, 1969)
    man (Buchthal, 1973)

Values legible in the scan, by column (the row to which each value
belongs is not recoverable):

    Larynx    CT                         166, 30, 55
              TA                         247
              IA                         90
              PCA                        116
              LCA                        64
    Other     Rectus Oculi Lateralis     13
              Biceps                     [illegible]

CT  = Cricothyroid
TA  = Thyroarytenoid
IA  = Interarytenoid
PCA = Posterior cricoarytenoid
LCA = Lateral cricoarytenoid



While different authors have found differences in the number of fibers in a
motor unit, there is general agreement that the laryngeal muscles have low
innervation ratios, though not quite so low as those of the eyeball and middle
ear; the muscles of the limbs and trunk have generally far higher ratios.

The muscle fibers themselves consist of a number of myofibrils, made up,
in turn, of a parallel, overlapping array of actin and myosin filaments. In
contraction, the actin and myosin filaments slide relative to each other, so
that the muscle shortens and develops tension. In normal physiological
conditions, this shortening is initiated by the release of a chemical
transmitter, acetylcholine, at the nerve-muscle junction, the motor end plate.

When a muscle fiber is at rest, there is a potential difference across
the cell membrane of about -90 mV, due to the difference in its permeability to
sodium and potassium ions. When a nerve impulse reaches the motor end plate,
acetylcholine is released, which changes the permeability of the membrane to
sodium and potassium ions. If this depolarization reaches sufficient levels,
the change in potential becomes self-regenerating, and travels along the
muscle fiber. During the passage of this action potential, the membrane
potential rises, then reverses its sign and finally returns to its resting
value of -90 mV. The movement of ions, and the associated changes in
potential, are, of course, the events generating the electromyographic signal.
The ionic currents at the membrane apparently release calcium ions within the
muscle; the diffused calcium activates the contractile component of the
muscle, producing the mechanical effect of muscle shortening or tension
development (Carlson & Wilkie, 1968).
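
The dependence of the resting potential on the relative permeability to
sodium and potassium can be illustrated with the Goldman-Hodgkin-Katz voltage
equation restricted to those two ions. The sketch below is not part of the
original report. The ion concentrations, the temperature, and the
permeability ratios are illustrative assumptions, chosen only to show that a
membrane far more permeable to potassium than to sodium rests near -90 mV, and
that raising the sodium permeability moves the potential toward zero.

```python
import math

R = 8.314      # gas constant, J/(mol*K)
F = 96485.0    # Faraday constant, C/mol


def resting_potential_mv(p_na_over_p_k, k_out=4.0, k_in=155.0,
                         na_out=145.0, na_in=12.0, temp_k=310.15):
    """Goldman-Hodgkin-Katz voltage for K+ and Na+ only, in millivolts.

    Concentrations (mM), temperature, and the permeability ratio are
    illustrative assumptions, not values taken from the report.
    """
    numerator = k_out + p_na_over_p_k * na_out
    denominator = k_in + p_na_over_p_k * na_in
    return 1000.0 * (R * temp_k / F) * math.log(numerator / denominator)


# Membrane mostly permeable to potassium (assumed ratio 0.01).
rest = resting_potential_mv(0.01)
# Sodium permeability raised to equal potassium permeability (assumed).
depolarized = resting_potential_mv(1.0)
print(round(rest, 1), round(depolarized, 1))
```

Only the permeability ratio changes between the two calls, so the difference
between the two printed values reflects the permeability change alone.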

While the fibers of striated muscles share many properties, they show
some adaptations to their individual tasks. The muscles of the larynx must be
designed for rapid adjustment; however, because of their participation in
respiration, they must have some capacity for sustained activity without
fatigue. Muscle fibers are of two basic types, red and white, although there
are variants in different systems in different animals. The "red" and "white"
designations refer to a difference in the fiber color, familiar from the light
and dark meat of chicken. The two types differ in their metabolic properties,
with red muscle more suited to sustained contraction because of its fatigue
resistance and white more suited to rapid phasic contraction. Most muscles of
the body, including the muscles of the larynx, show mixed red and white
fibers. Any single motor unit, however, is composed of fibers of a uniform
type (Brandstater & Lambert, 1973), although, since adjacent motor units have
overlapping territories, a cross-section of a muscle will show a checkerboard
pattern of red and white.

Biochemical and histological studies of the laryngeal muscles up to 1970
were summarized by Sawashima (1970). He concluded that, with respect to
metabolic properties, the intrinsic laryngeal muscles as a group appeared to
be intermediate between skeletal and heart muscles. However, he found
disagreements among the authors he reviewed as to similarities and
dissimilarities within the group.

Since that review, there have been further studies of the histochemistry
of the intrinsic muscles of the larynx. Data from one of them (Edström,
Lindquist, & Mårtensson, 1974) are shown in Table 2, showing the percentages




Table 2

Data on Histochemical Properties of the Intrinsic Laryngeal Muscles
in Cat, after Edström, Lindquist, and Mårtensson (1974)

                                       TYPE I          TYPE II
                                      (1)  (2)      (1)  (2)  (3)

Fiber type in skeletal muscle          I            IIA  IIB  IIC
(Kugelberg, 1973)

Overall % in laryngeal muscles,       [values not legible in the scan]
with most common subtype starred



Table 3

Data from Atkinson (1978) on the Mean Response Time for Some Intrinsic
and Extrinsic Laryngeal Muscles

                       Intrinsic Laryngeal Muscles      Strap Muscles
                         CT       TA       LCA           ST       SH

Mean Response Time       40       15       15           120       70



of Type I and Type II (red and white) fibers found for each of the four
laryngeal muscles examined. While some of the fibers were like Type I and
Type II fibers found in limb muscles, others were variants of previously
identified types. It is interesting to note that Type II variants are far
more common in the thyroarytenoid than in the cricothyroid.

A second study (Sahgal & Hast, 1974) examined the histochemical reactions
to ATP and three oxidative enzymes in cricothyroid and thyroarytenoid. The
results show some differences between the muscles, which the authors believe
are related to the differences in the speed of contraction of the
muscles.

Thus, differences in the histochemistry of the muscles appear to be
reflected in their contractile properties. We have seen that the laryngeal
muscles are composed predominantly of Type II fibers, like the intraocular
muscles in man (Kugelberg, 1973). The laryngeal muscles are generally agreed
to be fast muscles, although different authors have obtained different values
for their contraction time, the time from nerve or muscle stimulation to the
peak of the muscle tension. Figure 1, adapted from Sawashima's review (1970),
summarizes the results. The thyroarytenoid is consistently found to be faster
than the cricothyroid, which is consonant with the difference in proportion of
Type II fibers in the two muscles and, according to Sahgal and Hast (1974),
with the difference in their histochemical properties.

Contraction time for the intrinsic laryngeal muscles has been estimated
by a very different technique by Atkinson (1978) at Haskins Laboratories. He
reasoned that, if a causal relationship between f0 and the EMG activity of
various laryngeal muscles were assumed, there should be a correlation between
f0 and gross EMG activity, at some time delay determined by the mechanical
properties of the muscle. Thus, cross-correlation analysis should provide
clues to relative contraction time.

He asked speakers to produce sentences varying in stress and intonation,
thus varying f0, and cross-correlated average f0 and rectified and averaged
EMG activity, at varying delay times. Table 3 shows the delay times at which
correlation reached peak value for different muscles. The finding of shorter
mean response time for thyroarytenoid and lateral cricoarytenoid than for
cricothyroid, with longer response times for the strap muscles, is like the
results obtained by more conventional techniques, summarized in Figure 1, and
also parallels the histochemical grouping of TA with LCA, shown in Table 2.
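
The logic of the cross-correlation approach can be sketched in a few lines.
The code below is not Atkinson's procedure or data; it is a minimal
illustration under stated assumptions. The raw "EMG" is random noise, the
smoothing window is arbitrary, and the simulated f0 is constructed to follow
the EMG envelope after a fixed delay of 8 samples, so that the search has a
known answer to recover. All function names are hypothetical.

```python
import math
import random


def rectify_and_smooth(signal, window):
    """Full-wave rectify, then average over a trailing window of samples."""
    rectified = [abs(x) for x in signal]
    smoothed = []
    for i in range(len(rectified)):
        chunk = rectified[max(0, i - window + 1):i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_delay(envelope, f0, max_delay):
    """Delay (in samples) at which the EMG envelope best predicts later f0."""
    scores = {}
    for d in range(max_delay + 1):
        scores[d] = correlation(envelope[:len(envelope) - d], f0[d:])
    return max(scores, key=scores.get), scores


# Simulated data (assumption): f0 follows the envelope 8 samples later.
rng = random.Random(1)
emg = [rng.uniform(-1.0, 1.0) for _ in range(300)]
envelope = rectify_and_smooth(emg, window=10)
true_delay = 8
f0 = [100.0 + 20.0 * envelope[max(0, i - true_delay)]
      for i in range(len(envelope))]

delay, scores = best_delay(envelope, f0, max_delay=30)
print(delay)
```

With real recordings the peak is broader and lower than in this idealized
case, which is one reason such delay estimates are best read as relative
rather than absolute contraction times.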



THE ELECTROMYOGRAPHIC SIGNAL

The origin of the electromyographic signal is discussed above in only
very general terms. If the signals from the laryngeal muscles are to be
considered in detail, the recording procedure itself must be discussed.
Figure 2 (Geddes, 1972) shows a muscle with a pair of recording electrodes on
its surface. The fibers are aligned parallel to each other. When a muscle
fiber or the nerve is stimulated, a wave of depolarization passes along each
stimulated fiber. However, since each recording electrode is most sensitive
to the fiber closest to it, the event recorded will be weighted by the
distance between the pickup and the active fiber, as shown in the figure. As



Figure 1. Contraction time in msec for various laryngeal muscles. This
          figure is adapted in part from Table 1, Sawashima, 1970.
          [Labels legible in the figure: CT, TA, LAT, PCA, TH, SH, REC, SOL;
          "dog"; "From Mårtensson & Skoglund".]

Figure 2. Schematic diagram of electromyographic recording. In part (a), two
          electrodes are shown positioned over six muscle fibers. In (b),
          the assumed potential differences are shown for electrodes A and B,
          with the contributions from each fiber, and their difference.
          Reprinted from Geddes, 1972.



the wave of depolarization sweeps down the fibers and reaches the second
electrode, it becomes negative. The event recorded also reflects the timing
of the action potential passage at the two electrodes and the size of the
recording surface. In the example shown, there is a period when the fiber is
depolarized under both electrodes; hence, the signal returns to zero before
reversing its sign. Another factor determining the signal picked up by the
electrodes is the intervening tissue. In general, the presence of tissue
creates a low-pass filtering effect whose bandwidth decreases as distance
increases (DeLuca, 1978).
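
The way a bipolar pickup turns one travelling wave into a two-phase record
can be sketched numerically. The example below is a simplified illustration,
not the model behind Figure 2: the smooth pulse shape, the electrode
positions, and the conduction velocity of 4 mm per msec are all assumptions,
and tissue filtering is ignored.

```python
import math


def pulse(t_ms, width_ms=1.0):
    """Smooth single-phase stand-in for the depolarization at one electrode."""
    return math.exp(-0.5 * (t_ms / width_ms) ** 2)


def bipolar_record(times_ms, pos_a_mm, pos_b_mm, velocity_mm_per_ms=4.0):
    """Difference between electrodes A and B as one wave passes both."""
    record = []
    for t in times_ms:
        at_a = pulse(t - pos_a_mm / velocity_mm_per_ms)
        at_b = pulse(t - pos_b_mm / velocity_mm_per_ms)
        record.append(at_a - at_b)
    return record


# The wave reaches electrode A at 2 msec and electrode B at 4 msec.
times = [i * 0.1 for i in range(101)]            # 0 to 10 msec
signal = bipolar_record(times, pos_a_mm=8.0, pos_b_mm=16.0)

first_peak = max(range(len(signal)), key=lambda i: signal[i])
second_peak = min(range(len(signal)), key=lambda i: signal[i])
print(times[first_peak], times[second_peak])
```

The record is positive while the wave is nearer electrode A, passes through
zero when both electrodes see the same potential, and is negative while the
wave is nearer electrode B.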



While it is possible to record from a single muscle fiber (Ekstedt &
Stålberg, 1973), the more usual recording represents events in a motor unit,
or an aggregate of motor units. Under normal conditions, an action potential
propagating down a motor nerve activates all the fibers of its motor unit.
The fibers of a single motor unit are intermingled with those of other units
in such a way that the territory of one unit is about 20 times the
cross-sectional area of the fibers of the unit (Buchthal, Erminio, &
Rosenfalck, 1959). Since a portion of a muscle might contain fibers belonging
to any of fifty motor units, an electrode in the vicinity might detect
activity in any or all of them. The signal reaching a pair of electrodes in
active tissue is the weighted sum of the activity of each of the fibers of a
motor unit, with the filtering properties of the tissue between the electrode
and the active fiber taken into account. Since the orientation of the fibers
of each motor unit with respect to a fixed recording site will be unique, the
shape of the resulting recorded action potential will similarly be unique, and
can be used to recognize the unit (LeFever, 1980).

When a muscle is activated, the electrical manifestation of a motor unit
action potential is accompanied by a twitch of the activated fibers. In
muscle contraction in physiological conditions, the motor units are repeatedly
activated, whether the type of contraction is isometric (the muscle does not
shorten, but develops tension) or anisometric (the muscle shortens).

THE ELECTRODE

In recordings from the laryngeal muscles, or any others, it is often
possible to recognize individual motor units by visual inspection, especially
when levels of contraction are low, so that only a few motor units are active.
An example is shown in Figure 3, a recording from the cricothyroid muscle
(Faaborg-Andersen, 1964). Alternatively, it is possible to record from such a
large number of active fibers that individual components cannot be recognized,
as in Figure 4. The signals shown here are a so-called "interference
pattern." That is, the pattern represents the activity of a large number of
fibers. The experimenter may wish to record single motor units or
interference patterns depending on the purpose of the experiment, and makes a
choice of electrode accordingly.

Three general types of electrodes have been used in speech research:
surface, needle, and hooked wire electrodes. Of these, hooked wire electrodes
have been most useful for recording from the laryngeal muscles. The muscles
of the larynx are aligned in a way that signals picked up by an electrode on
the neck surface are ambiguous as to which muscle is the signal source. Thus,












Figure 3. Action potentials of a single motor unit during phonation. A.
          Cricothyroid muscle. B. Microphone recording. Reprinted from
          D. Brewer, 1964.

Figure 4. Quiet respiration. The onset of inspiration is indicated by the
          vertical stippled lines. A and B: Cricothyroid muscle. C and D:
          Vocalis muscle. E: Posterior cricoarytenoid muscle. Reprinted
          from D. Brewer, 1964.



although attempts have been made to use surface recordings from locations over
the thyroid cartilage in a biofeedback application (Guitar, 1975), it seems
unlikely that much further application will be made of such techniques.
Needle electrode insertions into the laryngeal muscles are not generally
feasible for posterior cricoarytenoid and interarytenoid muscles, although
such insertions were used by Faaborg-Andersen in his classic study. The work
of the past decade was done almost entirely with hooked wire electrodes,
except for some clinical work to be described by Hirose.

Figure 5 shows the classic version of the hooked wire electrode (Basmajian
& Stecko, 1962). Some technical details and possible variants of this type
of electrode are discussed by Basmajian (1978). This type of electrode has
been used in recording from the laryngeal muscles by a number of investigators
besides ourselves (Hirano & Ohala, 1969; Shipp, Fishman, & Morrissey, 1970).
Using them, we have been able to record from all of the intrinsic laryngeal
muscles (and a wide variety of other speech muscles) using techniques
developed collaboratively with Dr. Hajime Hirose and his colleagues at the
Institute of Logopedics and Phoniatrics at the University of Tokyo (Hirose,
Gay, & Strome, 1971).

If the investigator is interested in recording from a very small volume
of tissue, the recording surfaces of the electrodes must be made as small as
possible, while if the investigator is interested in a representation of the
activity of the whole muscle, the recording surface must be as large as
possible, while still remaining within the confines of the same muscle.
Obviously, since the laryngeal muscles are small, some conventional
configurations of electrode may record activity from more than one muscle
(Dedo & Dunker, 1966). In the conventional hooked wire electrode, the hooks,
which hold the wire in the muscle, also act as the recording points for the
bipolar pickup, through their cut ends. However, the spacing between the two
points is set arbitrarily by the way that the electrode happens to hook into
the muscle, and, indeed, may change within the recording session (Jonsson &
Komi, 1973). Since this type of electrode apparently records from a very small
volume of tissue, the fact that the distance between the electrode tips is not
fixed seems a design flaw. At Haskins, we have been exploring various
designs in which the functions of stabilization and recording are separated,
and the field size is fixed by the separation between recording points.

PROPERTIES OF MOTOR UNITS

Exploring the relationship between ideal electrode and experiment
requires a systematic discussion of the events within a muscle as we now know
them, largely from studies of limb muscles. Most issues of muscle
characteristics have only been explored with a limited number of muscles.

Let us begin with the single motor unit. In constant force contractions,
it will fire with an overall mean interspike interval and standard deviation
(DeLuca & Forrest, 1973; Figure 6), which can be used to characterize the
unit, and, perhaps, the muscle itself. MacNeilage (1973) has shown that
single motor units from CT and PCA fire at mean frequencies of about 15
impulses per second during low frequency phonation. He suggested that these
rates were intermediate between rates for limb and trunk and intraocular





Figure 5. Steps in making a bipolar fine-wire electrode with the carrier
          needle used for insertion. Reprinted from Basmajian and Stecko,
          1962. [Labels legible in the figure: a strand of nylon; Karma
          alloy wire looped through a 27-gauge hypodermic needle.]

Figure 6. Distribution of interpulse intervals from a single motor unit.
          Reprinted from DeLuca and Forrest, 1973. [Horizontal axis: time
          in msec. Statistics printed in the figure: mean 138.9 msec;
          SD 115.3 msec; skew 1.07; VMIN 5.3 msec; VMAX 866.9; no. of
          values 756.]

Figure 7. Synthetic interference pattern. The interference pattern at the
          bottom is the sum of the twenty "motor units" in the upper lines.
          C. DeLuca.



musculature, as we might expect from these other properties. However, he
found no evidence for the different kinds of units, tonic and kinetic,
postulated by Tokizane and Shimazu (1964) to be identifiable on the basis of
the relationship between variability and firing rates (MacNeilage, Sussman, &
Powers, 1977). Other authors (DeLuca & Forrest, 1973; Hannerz, 1974; Leifer,
1969) have found continuous distributions of single unit properties for
various limb muscles.
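
The interval statistics used to characterize a unit are simple to compute
from a list of firing times. The sketch below uses made-up firing times, not
data from any of the studies cited; it returns the same kinds of summary
values printed in Figure 6 (mean, standard deviation, skew) plus the
equivalent mean firing rate. The function name is hypothetical.

```python
import math


def interspike_stats(spike_times_ms):
    """Mean, standard deviation, and skewness of interspike intervals."""
    intervals = [b - a for a, b in zip(spike_times_ms, spike_times_ms[1:])]
    n = len(intervals)
    mean = sum(intervals) / n
    variance = sum((x - mean) ** 2 for x in intervals) / n
    sd = math.sqrt(variance)
    skew = sum((x - mean) ** 3 for x in intervals) / (n * sd ** 3)
    return {"n": n, "mean_ms": mean, "sd_ms": sd, "skew": skew,
            "rate_per_s": 1000.0 / mean}


# Made-up firing times in msec for one unit (illustrative only).
spikes = [0.0, 60.0, 125.0, 190.0, 270.0, 330.0, 400.0]
stats = interspike_stats(spikes)
print(stats)
```

Six intervals totalling 400 msec give a mean rate of about 15 impulses per
second, the order of magnitude reported above for CT and PCA units.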

During force-varying isometric contractions, there is a complex
relationship between variation in firing rate and recruitment. At low forces,
force tends to be increased by the recruitment of additional units, with
successively recruited units having higher firing rates at recruitment. As
force increases, individual units increase firing rates, and at the highest
force levels, very little recruitment occurs. Synchronization of firing of
units may occur as the muscle fatigues (DeLuca, 1978).

The most consistent observation of motor unit behavior is the
relationship between the size of the unit, and force output and order of
recruitment with increasing muscle force, the "size principle" (Henneman,
1975). While this relationship has not been observed for any of the laryngeal
muscles, it has been demonstrated for the masseter in humans (Yemm, 1977) and
for the anterior belly of the digastric by MacNeilage, Sussman, Westbury, and
Powers (1979), and there is no reason to believe that the laryngeal muscles
behave in a very unusual way in this respect. However, for all muscles, there
is some question as to whether there are reversals of recruitment order for
rapid, anisometric contractions.
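
A toy version of the size principle makes the recruitment order concrete.
The rule below is a deliberate simplification and an assumption of this
sketch: each unit's recruitment threshold is taken to be proportional to its
size, and the unit sizes are invented. It does not model firing rate changes
or the possible reversals mentioned above.

```python
def recruited_units(unit_sizes, drive):
    """Return the sizes of the units active at a given drive (0 to 1).

    Hypothetical rule: a unit's threshold is its size divided by the
    largest size, so small units are always recruited before large ones.
    """
    largest = max(unit_sizes)
    return [s for s in sorted(unit_sizes) if s / largest <= drive]


sizes = [20, 40, 80, 120, 160, 200]      # fibers per unit; invented values
for drive in (0.25, 0.5, 1.0):
    active = recruited_units(sizes, drive)
    print(drive, active, sum(active))
```

Because the units are added in order of size, each step in drive adds a
larger increment of active fibers than the one before it.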



A 

Since the territories ,of motor units overlap with increasing forces of 
contraction, it is increasingly difficult to identify individual units. Fc* 
•studies of such questions, electrode size must be reduced, and sophistical 
programs for the identification of motor units developed (LeFever, 1 980 ) . 



<rTHE INTERFERENCE PATTERN 

* ' *1 * 

Most electromyographic studies of the laryngeal muscles have been
concerned, not with the properties of individual motor units, but with the
functions of the muscles as a whole. Typically, the studies have related the
characteristics of a given muscle activity to some sort of output, such as
pitch. The electromyographic signal studied is usually an interference
pattern, the signal from a large number of motor units. As an aid in
visualization, it is interesting to look at a synthesized interference
pattern, Figure 7 (LeFever & DeLuca, personal communication). The figure
shows 20 motor units of shapes that would be characteristic of those found in
an electrode field during a constant force, isometric contraction. Their
sizes and the relative extent of positive and negative deviations from
baseline vary with distance from and orientation to the electrode. The sum of
positive and negative deviations is shown in the bottom line of the figure.
Obviously, there is summing and cancellation of signals from individual units,
depending on their phase relations. The resultant signal is noisy, and
difficult to deal with quantitatively. If the electrode size is reduced, so
that fewer units are represented in the signal, the interference pattern
becomes more variable as a function of time (Figure 8A).
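
The summing and cancellation just described can be illustrated with a small
sketch. It is not the synthesis procedure behind Figure 7; the function names
`unit_train` and `interference_pattern`, the two-sample biphasic waveform, and
the firing times are all hypothetical choices made only for illustration.

```python
from typing import List, Sequence


def unit_train(length: int, firing_times: Sequence[int],
               waveform: Sequence[float]) -> List[float]:
    """Place one copy of a unit's waveform at each firing time (in samples)."""
    train = [0.0] * length
    for t in firing_times:
        for k, value in enumerate(waveform):
            if 0 <= t + k < length:
                train[t + k] += value
    return train


def interference_pattern(trains: Sequence[Sequence[float]]) -> List[float]:
    """Sum the unit trains sample by sample, as an electrode would see them."""
    return [sum(samples) for samples in zip(*trains)]


# Two hypothetical units with the same biphasic shape but different sizes.
biphasic = [1.0, -1.0]
unit_a = unit_train(8, [0, 4], biphasic)
unit_b = unit_train(8, [1, 4], [0.5 * v for v in biphasic])

# At sample 1 the units are out of phase and partly cancel;
# at samples 4 and 5 they fire together and their potentials add.
pattern = interference_pattern([unit_a, unit_b])
print(pattern)
```

With more units and irregular firing times, the same sum yields the noisy
pattern described in the text.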




A number of steps must be taken to deal with such signals. The usual
approach has been to rectify and integrate. The effects of rectification are
shown in Figure 8B. The traditional use of the rectified and integrated EMG
signal is based on a large body of research investigating the relationship
between the magnitude of the EMG signal so obtained and the force output of
the muscle (Bigland & Lippold, 1954; Bouisset, 1973; Bouisset & Maton, 1973;
Inman, Ralston, Saunders, Feinstein, & Wright, 1952; Lippold, 1952; Zuniga &
Simons, 1969). This measure ("integrated EMG") varies roughly linearly with
force for isometric contractions at moderate force levels, but at higher
levels of force the relationship becomes nonlinear. The situation becomes far
more complex for anisometric contractions, in part because the mechanical
efficiency of a muscle depends on its length as well as its velocity of
shortening or lengthening. Since the events of interest in speech research
are typically of this latter sort, we can expect the magnitude of the EMG
signal to provide no more than an overall index of mechanical performance.
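
As a minimal sketch of what "rectify and integrate" means computationally,
assuming a digitized signal and a simple sum over consecutive non-overlapping
windows (the function names and the toy signal are illustrative assumptions,
not the laboratory's processing chain):

```python
from typing import List, Sequence


def rectify(signal: Sequence[float]) -> List[float]:
    """Full-wave rectification: replace each sample by its absolute value."""
    return [abs(x) for x in signal]


def integrate(signal: Sequence[float], window: int, dt: float = 1.0) -> List[float]:
    """Integrate over consecutive non-overlapping windows of `window` samples.

    Each output value is the sum of one window's samples times the sampling
    interval `dt` (a rectangle-rule integral). A trailing partial window is
    dropped.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of samples")
    n_windows = len(signal) // window
    return [
        sum(signal[i * window:(i + 1) * window]) * dt
        for i in range(n_windows)
    ]


def integrated_emg(raw: Sequence[float], window: int, dt: float = 1.0) -> List[float]:
    """Rectify, then integrate: the conventional 'integrated EMG' measure."""
    return integrate(rectify(raw), window, dt)


# A toy biphasic signal: positive and negative phases cancel if the raw
# signal is integrated directly, but not after rectification.
raw = [1.0, -1.0, 2.0, -2.0, 0.5, -0.5, 0.0, 0.0]
print(integrate(raw, window=4))
print(integrated_emg(raw, window=4))
```

Integrating the raw signal gives zero in both windows, which is why
rectification must come first.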

A possibility that we have explored informally at Haskins is calculating
the variance of the interference pattern, which is equal to the sum of the
variances of the motor unit action potential trains contributing, and hence
does not lead to the loss of contributions of motor units due to cancellation
as does the more conventional measure.
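
The additivity of variance can be checked numerically. The sketch below
assumes that the contributing trains are mutually uncorrelated, which is the
condition under which the variance of a sum equals the sum of the variances.
The sinusoidal "trains" are hypothetical stand-ins chosen because they are
exactly uncorrelated over the record, not realistic action potential trains.

```python
import math
from typing import List, Sequence


def variance(x: Sequence[float]) -> float:
    """Population variance of a sampled signal."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def mean_rectified(x: Sequence[float]) -> float:
    """Mean absolute value: the rectify-and-integrate measure, normalized."""
    return sum(abs(v) for v in x) / len(x)


def sine_train(cycles: int, n: int, amplitude: float = 1.0) -> List[float]:
    """Hypothetical stand-in for one unit's train: whole sine periods over n samples."""
    return [amplitude * math.sin(2 * math.pi * cycles * k / n) for k in range(n)]


n = 1000
trains = [sine_train(3, n), sine_train(7, n, 0.5), sine_train(11, n, 2.0)]
interference = [sum(samples) for samples in zip(*trains)]

# Variance is additive for uncorrelated trains ...
var_of_sum = variance(interference)
sum_of_vars = sum(variance(t) for t in trains)

# ... whereas the rectified measure loses part of each contribution
# to cancellation between trains.
rect_of_sum = mean_rectified(interference)
sum_of_rects = sum(mean_rectified(t) for t in trains)

print(var_of_sum, sum_of_vars)
print(rect_of_sum, sum_of_rects)
```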


We have said very little about the time constant to be used for
integration. We use a 5 millisecond hardware integration window and smooth
further algebraically, using software programs in which a time constant may be
chosen. Individual tokens recorded with hooked-wire electrodes show sizable
fluctuations that are not represented in the mechanical output of the muscle
as a whole. For speech, time-smoothing is useful only to the point where it
does not obscure the sequencing of underlying articulatory events. An
alternative way of smoothing is ensemble averaging. The effects of
time-smoothing and ensemble averaging are shown in Figure 9, which shows averaged
and integrated signals from repeated utterances. The details of these
analysis procedures are discussed at greater length in laboratory reports
(Kewley-Port, 1973, 1974).
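
The difference between the two kinds of smoothing can be sketched as follows.
This is an illustration only: the centered moving average stands in for
whatever software time constant is chosen, and the function names and the two
toy tokens are assumptions, not the procedures documented in the laboratory
reports.

```python
from typing import List, Sequence


def smooth(signal: Sequence[float], width: int) -> List[float]:
    """Time-smoothing: centered moving average over `width` samples (odd).

    Near the edges the window is truncated to the available samples.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number of samples")
    half = width // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def ensemble_average(tokens: Sequence[Sequence[float]]) -> List[float]:
    """Ensemble averaging: mean across tokens at each aligned sample."""
    length = len(tokens[0])
    if any(len(t) != length for t in tokens):
        raise ValueError("tokens must be aligned and of equal length")
    return [sum(column) / len(tokens) for column in zip(*tokens)]


# Two rectified tokens of the same utterance, aligned at a line-up point.
tokens = [
    [0.0, 4.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0, 4.0, 0.0],
]

# Averaging across tokens keeps the two peaks separate in time;
# smoothing a single token along time spreads them into each other.
print(ensemble_average(tokens))
print(smooth(tokens[0], width=3))
```

Ensemble averaging reduces token-to-token variability without blurring the
time axis, which is why it preserves the sequencing of articulatory events
better than a long time constant does.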




LARYNGEAL MUSCLE STUDIES

Having reviewed the general properties of muscles, and of the laryngeal
muscles in particular, as well as some technical problems, we turn now to the
results of electromyographic studies of the function of these muscles in
speech. The most primitive question is, perhaps, what muscles should be
considered as laryngeal muscles? Traditionally, the muscles of the larynx
have been divided into two groups, intrinsic and extrinsic. The identity of
the intrinsic muscles is readily agreed upon; they are the cricothyroids (CT),
the thyroarytenoids (TA), the interarytenoids (IA), the lateral
cricoarytenoids (LCA), and the posterior cricoarytenoids (PCA). The identity
of the extrinsic laryngeal muscles is more difficult to specify. If we take the
empirical point of view that any muscle that affects the positions of thyroid,
cricoid, and arytenoid cartilages relative to each other may be considered to
be an extrinsic laryngeal muscle, then a wide variety of muscles, not normally
considered in relation to the larynx, must be included. For example, Painter
(1978) has produced some evidence that genioglossus activity may influence
pitch, and Erickson, Liberman, and Niimi (1977) have produced the same sort of
evidence for geniohyoid. The implication is that a wide variety of muscles
may affect pitch, as Sonninen suggested many years ago (1956). However, given
the lack of detailed information about secondary effects on vocal fold
adjustment, only the three strap muscles, the sternohyoid, the thyrohyoid, and
the sternothyroid, will be considered as extrinsics here.

[Figure 9 (plot not reproduced). Column labels: levator palatini EMG,
unsmoothed and smoothed; intraoral air pressure; amplitude envelope; pitch;
velar elevation. Tokens are aligned at the sampling line-up reference point.]

Figure 9. Individual and averaged tokens for the spoken utterance "faz
map." The top row represents averages of 20 tokens. Four tokens
are shown beneath the average. The first two columns show EMG
output from the levator palatini, after sampling and rectification,
before and after smoothing. The remaining columns show intraoral
pressure, audio amplitude, fundamental frequency, and measured
velar height. Haskins Laboratories.


Fundamental Frequency Control. Electromyographic studies on the
regulation of pitch have been reported by many authors. More recent
electromyographic studies have included those of Hirano, Vennard, and Ohala (1970),
Shipp and McGlone (1971), Gay, Hirose, Strome, and Sawashima (1972), and Baer,
Gay, and Niimi (1976).



These studies all conclude that cricothyroid activity increases as the
pitch is raised, at least over most of the pitch range, as we might have
expected from the mode of action of this muscle in producing torque around the
cricothyroid joint. This action presumably underlies the observed lengthening
of the folds with increasing f0.

The activity of TA also increases as the pitch is raised over most of the
pitch range, although it is more active in chest voice than in falsetto
(Hirano, Ohala, & Vennard, 1969; Hirano et al., 1970; Baer et al., 1976), but
the function of this activity is obscure. The thyroarytenoid could act, of
course, to produce a shortening force in opposition to CT, although this
cannot be its primary function, since its activity increases with pitch rise,
rather than pitch fall. One theory, by van den Berg (1960), as to its primary
function suggests that it exerts "medial compression," limiting the horizontal
extent of vocal fold vibration, permitting the more effective play of
aerodynamic forces. An alternate possibility is that its tension is adjusted,
with compensating adjustments of CT, to tune the natural vibrating frequency
of the muscle itself, considered as a tissue mass, since the muscle makes up
the bulk of the folds and so determines, in large part, their vibratory
characteristics. A secondary problem in the characterization of TA activity
is that there is disagreement in the literature as to whether there are
functional or anatomical differences between lateral and medial (vocalis)
parts of TA, so that an adequate description of the function of one part may
not suffice for the other (Sawashima, 1970).

Reports on the other laryngeal adductors, IA, LCA, and the more lateral
parts of TA, tend to show increasing activity with increasing pitch. Van den
Berg (1960) suggested, on the basis of cadaver experiments, that the IA might
be active without the laterals at very low pitches, but this possibility has
never been experimentally verified.


Some authors (e.g., Dedo, 1970; Gay et al., 1972; Baer et al., 1976)
report increases of PCA activity at the highest f0's when intensity is great,
although there is not universal agreement on this point (Shipp & McGlone,
1971). Although this muscle is normally an abductor, its activity at high f0
is thought to brace the arytenoids against the anterior pull of the vocal
folds. The observations of Gay et al. are summarized in Figure 10.



Control of f0 by the extrinsic muscles of the larynx is less well
understood than control by the intrinsic muscles. The larynx, and f0, move up
and down during singing by untrained singers, or during speech, although
trained singers learn to keep the larynx at an approximately constant low
position (Sonninen, 1956; Shipp & Izdebski, 1975). These movements are
produced largely by activity of the extrinsic attachments to the larynx,
especially by the strap muscles.

Strap muscle activity (sternohyoid, sternothyroid) is correlated with f0
at both its highest and lowest levels. Although Kakita and Hiki (Note 1) have
reported differentiation among these muscles, the weight of the evidence is
that they act together in controlling pitch. This finding is supported both
by electromyographic measurements (Faaborg-Andersen & Sonninen, 1960; Baer et
al., 1976) and by clinical observation of patients who have had these muscles
sectioned (Sonninen, 1956). Although, on anatomical grounds, it would seem
that the sternothyroid muscle ought to increase f0 by tilting the thyroid
cartilage down and forward, and that the thyrohyoid ought to decrease f0 by
tilting the thyroid cartilage up and back, Sonninen showed that the situation
is more complex. In experiments with cadavers and in stimulation experiments
with patients undergoing thyroidectomy, he found that the effect on the larynx
of activity of these muscles depended on posture and head position. The
sternothyroid, in particular, can tilt the thyroid cartilage either way.

Sonninen developed an "external frame function" theory to account for f0
raising, based on his own results and those of other investigators. According
to this theory, all the strap muscles work in conjunction with the anterior
suprahyoid muscles. Although the strap muscles may or may not raise the
larynx, their main function is to pull the thyroid cartilage forward. At the
same time, activity of the cricopharyngeus and downward pull of the esophagus
exert a downward and backward force on the posterior part of the cricoid
cartilage.

Since the mechanism for application of the "external frame function"
theory to f0 lowering has been elusive, alternative theories have been
advanced. One of these is the passive theory, stating that f0/larynx lowering
is due to relaxation of the mechanisms for f0/larynx raising. Although
passive lowering can explain some of the observed relationships, two facts
support the notion of at least an ancillary active mechanism.
Electromyographic activity accompanies lowering, as we noted above, and studies
of vertical larynx position show that the position during low frequency
phonation is lower than that in rest position (Shipp & Izdebski, 1975). A
second theory, attributed to Ohala (1972), suggests that raising and lowering
the larynx affects f0 directly through adjustment of the vertical tension of
the vocal fold cover, which is continuous with the lining of the trachea.
This theory cannot be adequately evaluated without improved understanding of
the vibratory mechanism of the vocal folds and actual measurements of
vertical tension in raised-larynx and lowered-larynx configurations.
Finally, a theory accounting for f0 lowering by laryngealization has been
proposed by Lindqvist (1969). This theory asserts that the vocal folds are
shortened (and, incidentally, transglottal pressure is reduced) by activity of
the muscle fibers of the aryepiglottic sphincter. This mechanism does not
appear to require lowering of the larynx and hence does not explain the
observed movements or associated EMG activity. It may operate jointly with or
independently of other mechanisms.




Results of studies of strap muscle function in speech first suggested
that although f0 falls were always accompanied by an increase in strap muscle
activity, the activity did not always precede f0 falls, and showed substantial
effects of segmental variables (Collier, 1975; Hirano et al., 1969). Later
analysis, however, suggested that strap activity does precede pitch drops from
a mid to low range (Atkinson & Erickson, 1977; Erickson et al., 1977).

A problem in studying pitch control in speech has been the difficulty of
analyzing the relationships among f0, subglottal pressure, and the antecedent
activity of the large number of relevant muscles. One technique that has
been found useful cross-correlates f0 and integrated EMG (Atkinson, 1978).
The delay at which the correlation reaches a maximum can be used to estimate
the response time of the muscle. The magnitude of the correlation at this
delay can then be used in estimating the magnitude of that muscle's
contribution to pitch control. The analysis can be further refined by dividing the
fundamental frequency range into subranges. Atkinson's study shows the
contribution of strap muscle activity to be greatest at low frequencies, while
CT activity has its greatest effects at high frequencies. Although the data
analyzed in the study were extremely limited, further exploitation of the
technique seems warranted.
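
The delay search at the heart of this technique can be sketched as below.
This is an illustration under stated assumptions, not Atkinson's procedure:
it assumes equally sampled f0 and integrated EMG records, uses the ordinary
product-moment correlation, and selects the delay with the largest absolute
correlation so that muscles associated with f0 lowering (negative
correlation) are handled as well. The names `pearson` and `best_lag` and the
toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant record carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def best_lag(emg: Sequence[float], f0: Sequence[float],
             max_lag: int) -> Tuple[int, float]:
    """Find the delay (in samples) by which f0 follows EMG most closely.

    For each candidate delay, EMG at time t is paired with f0 at time
    t + delay. Returns the delay with the largest absolute correlation,
    together with the signed correlation at that delay.
    """
    if len(emg) != len(f0):
        raise ValueError("records must have the same length")
    if not 0 <= max_lag < len(emg) - 1:
        raise ValueError("max_lag must leave at least two paired samples")
    chosen_lag, chosen_r = 0, 0.0
    for lag in range(max_lag + 1):
        r = pearson(emg[:len(emg) - lag], f0[lag:])
        if abs(r) > abs(chosen_r):
            chosen_lag, chosen_r = lag, r
    return chosen_lag, chosen_r


# Toy data: f0 follows the EMG envelope with a delay of 3 samples.
emg = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 4.0, 1.0,
       0.0, 2.0, 5.0, 0.0, 0.0, 1.0, 0.0, 3.0]
delay = 3
f0 = [100.0] * delay + [100.0 + 2.0 * v for v in emg[:-delay]]

lag, r = best_lag(emg, f0, max_lag=6)
print(lag, r)
```

Multiplying the chosen delay by the sampling interval gives the estimated
response time; repeating the search within f0 subranges corresponds to the
refinement mentioned in the text.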

There is, nonetheless, a limit to the amount of reliance one can place on
the results of gross correlation studies. An ingenious new technique for
studying the relationship of f0 and the activity of the various laryngeal
muscles has been suggested by Baer (1978). The technique was adapted from one
originally designed for the study of skeletal muscles (Milner-Brown, Stein, &
Yemm, 1973). Continuous records were made of electromyographic activity from
laryngeal muscles and of voice fundamental frequency from a subject producing
steady, sustained phonation at low f0. The fundamental frequency record
exhibits small perturbations around a nominally constant value. If we assume
that these perturbations represent the response to the firing of single motor
units in those muscles that control pitch, then an average-response
computation of fundamental frequency triggered by single motor unit firing of any
muscle should exhibit a systematic deviation in the interval immediately
following the firings. Figure 11 shows the results of following this
procedure for CT. Using this technique, muscles whose activity is grossly
intercorrelated can be decorrelated to examine their individual effects on
some variable. We feel that this technique shows great promise in the
application just suggested, and others.
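
The average-response computation can be sketched as follows. The sketch
assumes that firing times have already been identified as sample indices in
the f0 record; the function name `triggered_average`, the window lengths, and
the synthetic record are illustrative assumptions and do not reproduce the
analysis behind Figure 11.

```python
from typing import List, Sequence


def triggered_average(record: Sequence[float], firing_indices: Sequence[int],
                      pre: int, post: int) -> List[float]:
    """Average `record` over windows aligned at each firing.

    Each window runs from `pre` samples before a firing to `post` samples
    after it. Firings too close to either end of the record are skipped.
    """
    segments = [
        record[i - pre:i + post + 1]
        for i in firing_indices
        if i - pre >= 0 and i + post < len(record)
    ]
    if not segments:
        raise ValueError("no firing leaves a complete window in the record")
    return [sum(column) / len(segments) for column in zip(*segments)]


# Synthetic f0 record: nominally constant at 100 Hz.
record = [100.0] * 40
firings = [5, 15, 25]

# A time-locked response: f0 rises 2 samples after every firing.
for i in firings:
    record[i + 2] += 3.0

# An unrelated perturbation that is not locked to the firings.
record[9] += 3.0

average = triggered_average(record, firings, pre=1, post=4)
print(average)
```

The time-locked deviation survives averaging at full size, while the
unrelated perturbation is divided by the number of firings; with many
firings it would tend toward the baseline.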



Stricture Control and Voicing Features 


A second dimension of laryngeal adjustment in speech is stricture
control, the degree to which the laryngeal sphincter is closed by the
approximation of the vocal folds. While these adjustments can be used to
produce overall changes in voice quality, most speech studies of this
dimension have been aimed at understanding the mechanism of consonant voicing.
Fiberoptic visualizations of the glottis (Sawashima, Abramson, Cooper, &
Lisker, 1970; Kagaya, 1974) show that voiced and voiceless consonants are
characterized by differences in glottal opening. It is the timing of the
abduction and adduction of the folds, relative to the movement of the upper
articulators, that distinguishes consonant classes within and across
languages.


[Figure 11 (plot not reproduced). Panel labels: cricothyroid muscle, N = 19,
100 Hz; raw EMG, aligned at single firings and averaged; second trace, aligned
as above and averaged.]

Figure 11. Single motor units of the cricothyroid, aligned and averaged, with
parallel measure of pitch perturbation. See text for explanation.
From Baer, 1981.






Anatomically, the five intrinsic laryngeal muscles can be divided into
three functional groups with respect to stricture control: abductor (PCA),
adductor (INT, TA, LAT), and tensor (CT). (INT and LAT here denote the
interarytenoid and lateral cricoarytenoid muscles, abbreviated IA and LCA
above.) The question can then be asked
whether the muscles function in speech in ways that the classification would
suggest. Is there active abduction and adduction in voicing maneuvers? Do
the adductors function together? Finally, is the activity of adduction and
abduction accompanied by changes in tensing?

Abduction and adduction for voicing are clearly accomplished by the
action of PCA and INT activity in a reciprocal way, as has been demonstrated
in a number of studies (Hirose & Gay, 1973; Fischer-Jørgensen & Hirose, 1974;
Hirose & Ushijima, 1976).

Figure 12 shows a fairly typical pattern obtained for this pair of
muscles (Hirose, Lisker, & Abramson, 1972). The general conclusion is that
as the abductor (PCA) contracts, the adductor (INT) relaxes. The relationship
has been quantified. Hirose (1977) showed that for a series of utterances
containing voiced and voiceless stops, produced by a Japanese talker, the
value of the correlation coefficient between the activity of the two muscles
ranges between -.85 and -.65. The analysis does not make it clear what
variables affect the value in a critical way.
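
The report cited does not, as summarized here, specify how the coefficient
was computed. Assuming the usual product-moment coefficient between the
integrated EMG curves of the two muscles, with $x_i$ and $y_i$ introduced
here as illustrative symbols for PCA and INT activity at sample $i$ of an
utterance, the quantity would be:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

On this reading, a value of -1 would mean perfectly reciprocal activity, and
the reported range of -.85 to -.65 indicates a strong but imperfect inverse
relation.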



The extent to which the activity of the adductor group is correlated in
such maneuvers is still unclear. Some time ago, van den Berg and Tan (1959)
showed, in cadaver experiments, that the different adductor muscles can be
used to close the cartilaginous and membranous parts of the larynx
separately. Thus, we might expect some differences between the activity patterns of
INT on the one hand, and LAT and TA on the other. Such differences have been
seen in studies of Korean stops (Hirose, Lee, & Ushijima, 1974), Danish stød
(Fischer-Jørgensen & Hirose, 1974), and glottal stops (Hirose & Gay, 1973).
Apparently, the activity of LAT and TA is connected to the necessity for
strong medial compression in these productions. However, the detailed effects
of differential contraction of these muscles on the shape of the glottis are
not known. Figure 13 shows the contrast in activity of INT and VOC (TA) for
the three types of voiceless stop found in Korean. The important point to
note, apart from the obvious overall differences, is that there is a sharp
peak in VOC activity for the glottalized Korean stop at consonant release,
probably associated with increased tension of the folds.

A recent experiment by Yoshioka (1979) also suggests circumstances in
which we perhaps will observe differentiation among laryngeal adductors in
stricture control. He found that /h/ and /s/ may be produced with equal
glottal widths, and equivalent patterns of reciprocal PCA and INT activity,
but still differ in the presence of vibration at the edges of the membranous
portion of the folds in some examples of /h/. An obvious possibility is that
other intrinsic laryngeal muscles show differences in activity for stricture
control for the sounds.

A third question associated with the activity of the vocal folds in
voicing control is whether activity of CT is associated with abduction or
adduction. Stevens' model of glottal activity suggests that the tension of
the vocal folds will affect the likelihood of vibration, for a given pressure
drop across the glottis. It is therefore possible that some stops are








Figure 13. Averaged EMG curves for INT and VOC for the three bilabial stops of
Korean: voiceless and aspirated ([ph]), voiceless and slightly aspirated,
and voiceless and glottalized. From Hirose, Lee, and Ushijima, 1974.









Figure 14. Cricothyroid activity for the three bilabial stops of Korean. The
three curves in each box represent utterances containing the vowels
/i/, /a/, and /u/. From Hirose, Lee, and Ushijima, 1974.



characterized by contrasts in CT activity, particularly those that contrast in
degree of aspiration, like those of Korean (Hirose et al., 1974). A study of
stop production in a single speaker (Figure 14) fails to support the
hypothesis of CT differentiation, but small differences in CT activity
accompanying voicing contrasts have been found from time to time.

The brief summary of laryngeal muscle function in this section and the
preceding one reveals that we now have a gross qualitative sketch of the
activity patterns, and the technical means at hand to elaborate this picture,
to match models and observations of the larynx developed in other ways.
However, we might now ask what clinical uses might be made of EMG using
presently available techniques.

ELECTROMYOGRAPHY IN FUTURE DEVELOPMENTS

At present, EMG is widely used in diagnosis of neuromuscular disorders.
It has not been used this way for the laryngeal muscles, although it perhaps
could be. For example, it seems possible to detect abnormal single motor unit
firing patterns in these muscles, abnormal synchronization of motor unit
firings (Hirose, 1977), or, perhaps, to differentiate peripheral neurogenic
and myogenic disorders.


Another use, from my point of view a very exciting one, is to use EMG as
a technique for examining articulatory programming and its breakdown. The
work described in this paper, and others, can be used to show a very tightly
time-constrained coordination of laryngeal and supralaryngeal events in
running speech. Aspects of this coordination appear to break down in
stuttering (Freeman & Ushijima, 1978), and in apraxia (Freeman, Sands, &
Harris, 1978). While the broad perceptual consequences of breakdown in
laryngeal coordination have often been described (e.g., Darley, Aronson, &
Brown, 1975), it seems far more direct to look at the underlying failures of
patterning. One of the most unfortunate consequences of the description of
normal and abnormal speech in terms of transcriptional entities has been to
focus description of speech motor behavior on the attainment or failure of
attainment of stationary acoustic or articulatory targets, rather than on the
temporal prescription for coordinated activity. For normal speakers, we need
to investigate what maintains these prescriptions, by systematically
attempting to disrupt them. For abnormal speakers, we need, first, to describe the
disrupted speech in terms of the constituent articulatory acts, and second, to
investigate the relative roles of various factors, such as feedback, in
maintenance of existing coordinations.



REFERENCE NOTE 



1. Kakita, Y., & Hiki, S. A study of laryngeal control for voice pitch based
on anatomical model. Paper presented at the Eighth International Congress
on Acoustics, London, July, 1974.

REFERENCES

Atkinson, J. E. Correlation analysis of the physiological factors controlling
fundamental voice frequency. Journal of the Acoustical Society of
America, 1978, 63, 211-222.

Atkinson, J. E., & Erickson, D. The function of strap muscles in speech:
Pitch lowering, or jaw opening? Haskins Laboratories Status Report on
Speech Research, 1977, SR-49, 97-102.

Baer, T. Effect of single-motor-unit firings on fundamental frequency of
phonation. Journal of the Acoustical Society of America, 1978, 64
(Suppl. 1), S90. (Abstract)

Baer, T. Investigation of the phonatory mechanism. Haskins Laboratories
Status Report on Speech Research, 1981, SR-66, this volume.

Baer, T., Gay, T., & Niimi, S. Control of fundamental frequency, intensity,
and register of phonation. Haskins Laboratories Status Report on Speech
Research, 1976, SR-45/46, 175-185.

Basmajian, J. Muscles alive (4th ed.). Baltimore: Williams and Wilkins,
1978.

Basmajian, J. V., & Stecko, G. A new bipolar indwelling electrode for
electromyography. Journal of Applied Physiology, 1962, 17, 849.

Bigland, B., & Lippold, O. C. J. The relation between force, velocity and
integrated electrical activity in human muscles. Journal of Physiology,
1954, 123, 214-224.

Bouisset, S. EMG and muscle force in normal motor activities. In
J. E. Desmedt (Ed.), New developments in electromyography and clinical
neurophysiology (Vol. 1). Basel: S. Karger, 1973, 547-583.

Bouisset, S., & Maton, B. Comparison between surface and intramuscular EMG
during voluntary movement. In J. E. Desmedt (Ed.), New developments in
electromyography and clinical neurophysiology (Vol. 1). Basel:
S. Karger, 1973, 533-539.

Brandstater, M. E., & Lambert, E. H. Motor unit anatomy. In J. E. Desmedt
(Ed.), New developments in electromyography and clinical neurophysiology
(Vol. 1). Basel: S. Karger, 1973, 14-22.

Buchthal, F. Electromyography. In A. Remond (Ed.), Handbook of
electromyography and clinical neurophysiology (Vol. 16). Amsterdam:
Elsevier, 1973.

Buchthal, F., Erminio, F., & Rosenfalk, P. Motor unit territory in different
human muscles. Acta Physiologica Scandinavica, 1959, 45, 72-87.

Carlson, F. D., & Wilkie, D. R. Muscle physiology. Englewood Cliffs, N.J.:
Prentice-Hall, 1968.

Collier, R. Physiological correlates of intonation patterns. Journal of the
Acoustical Society of America, 1975, 58, 249-255.

Darley, F. L., Aronson, A. E., & Brown, J. R. Motor speech disorders.
Philadelphia: W. B. Saunders Company, 1975.

Dedo, H. The paralyzed larynx. An electromyographic study of dogs and
humans. Laryngoscope, 1970, 80, 1455-1517.

Dedo, H., & Dunker, E. The volume conduction of motor unit potentials.
Electroencephalography and Clinical Neurophysiology, 1966, 20, 608-613.

DeLuca, C. Towards understanding the EMG signal. In J. Basmajian (Ed.),
Muscles alive (4th ed.). Baltimore: Williams and Wilkins, 1978, 53-78.

DeLuca, C. J., & Forrest, W. J. Properties of motor unit action potential
trains. Kybernetics, 1973, 12, 160-168.




Edström, L., & Kugelberg, E. Histochemical composition, distribution of
fibers and fatiguability of single motor units in the anterior tibial
muscle of the rat. Journal of Neurology, Neurosurgery and Psychiatry,
1968, 31, 424-433.

Edström, L., Lindquist, C., & Mårtensson, A. Correlation between functional
and histochemical composition of the laryngeal muscles in the cat. In
B. Wyke (Ed.), Ventilatory and phonatory control systems. London:
Oxford University Press, 1974, 392-403.

Ekstedt, J., & Stålberg, E. Single fibre electromyography for the study of
the microphysiology of the human muscle. In J. E. Desmedt (Ed.), New
developments in electromyography and clinical neurophysiology (Vol. 1).
Basel: S. Karger, 1973, 89-112.

English, D. T., & Blevins, C. E. Motor units of laryngeal muscles. Archives
of Otolaryngology, 1969, 89, 782-785.

Erickson, D., Liberman, M., & Niimi, S. The geniohyoid and the role of the
strap muscles. Haskins Laboratories Status Report on Speech Research,
1977, SR-49, 103-110.

Faaborg-Andersen, K. Electromyographic investigation of intrinsic laryngeal
muscles in humans. Acta Physiologica Scandinavica, 1957, 41, Suppl. 140,
1-149.

Faaborg-Andersen, K. L. Electromyography of the laryngeal muscles in man. In
D. W. Brewer (Ed.), Research potentials in voice physiology. New York:
State University of New York, 1964, 105-129.

Faaborg-Andersen, K., & Sonninen, A. The function of the extrinsic laryngeal
muscles at different pitch. Acta Oto-laryngologica, 1960, 51, 89-93.

Fischer-Jørgensen, E., & Hirose, H. A note on laryngeal activity in the
Danish "stød." Haskins Laboratories Status Report on Speech Research,
1974, SR-39/40, 255-259.

Freeman, F. J., Sands, E. S., & Harris, K. S. Temporal coordination of
phonation and articulation in a case of verbal apraxia: A voice onset
time study. Brain and Language, 1978, 106-111.

Freeman, F. J., & Ushijima, T. Laryngeal muscle activity during stuttering.
Journal of Speech and Hearing Research, 1978, 21, 538-562.

Gay, T., Hirose, H., Strome, M., & Sawashima, M. Electromyography of the
intrinsic laryngeal muscles during phonation. Annals of Otology,
Rhinology and Laryngology, 1972, 81, 401-409.

Geddes, L. A. Electrodes and the measurement of bioelectric events. New
York: Wiley, 1972.

Guitar, B. Reduction of stuttering frequency using analog electromyographic
feedback. Journal of Speech and Hearing Research, 1975, 18, 672-685.

Hannerz, J. Discharge properties of motor units in relation to recruitment
order in voluntary contraction. Acta Physiologica Scandinavica, 1974,
91, 374-384.

Hast, M. H. Studies of the extrinsic laryngeal muscles. Archives of
Otolaryngology, 1968, 88, 273-278.

Hast, M. H. The primate larynx: A comparative physiological study of
intrinsic muscles. Acta Otolaryngologica, 1969, 67, 84-92.



Henneman, E. Principles governing distribution of sensory input to motor
neurons. In E. Evarts (Ed.), Central processing of sensory input leading
to motor output. Cambridge, Mass.: MIT Press, 1975, 281-293.

Hirano, M., & Ohala, J. Use of hooked-wire electrodes for electromyography of
the intrinsic laryngeal muscles. Journal of Speech and Hearing Research,
1969, 12, 362-373.









Hirano, M., Ohala, J., & Vennard, W. The function of laryngeal muscles in
regulating fundamental frequency and intensity of phonation. Journal of
Speech and Hearing Research, 1969, 12, 616-628.

Hirano, M., Vennard, W., & Ohala, J. Regulation of register, pitch, and
intensity of voice. Folia Phoniatrica, 1970, 22, 1-20.

Hirose, H. Electromyography of the larynx and other speech organs. In
M. Sawashima & F. S. Cooper (Eds.), Dynamic aspects of speech production.
Tokyo: University of Tokyo Press, 1977.

Hirose, H., & Gay, T. Laryngeal control in vocal attack. An
electromyographic study. Folia Phoniatrica, 1973, 25, 203-213.

Hirose, H., Gay, T., & Strome, M. Electrode insertion techniques for
laryngeal electromyography. Journal of the Acoustical Society of
America, 1971, 50, 1449-1450.

Hirose, H., Lee, C. Y., & Ushijima, T. Laryngeal control in Korean stop
production. Journal of Phonetics, 1974, 2, 145-152.

Hirose, H., Lisker, L., & Abramson, A. S. Physiological aspects of certain
laryngeal features in stop production. Haskins Laboratories Status
Report on Speech Research, 1972, SR-31/32, 183-191.

Hirose, H., & Ushijima, T. More on laryngeal control for voicing distinction
in Japanese consonant production. Annual Bulletin (Research Institute of
Logopedics and Phoniatrics, University of Tokyo), 1976, 10, 101-112.

Hirose, H., Ushijima, T., Kobayashi, T., & Sawashima, M. An experimental
study of the contraction properties of the laryngeal muscles in the cat.
Annals of Otolaryngology, 1969, 78, 297-307.

Inman, V. T., Ralston, H. J., Saunders, J. B., Feinstein, B., & Wright, E. W.
Relation of human electromyogram to muscular tension.
Electroencephalography and Clinical Neurophysiology, 1952, 4, 187-194.

Jonsson, B., & Komi, P. V. Reproducibility problems when using wire
electrodes in electromyographic kinesiology. In J. E. Desmedt (Ed.), New
developments in electromyography and clinical neurophysiology (Vol. 1).
Basel: S. Karger, 1973, 540-546.

Kagaya, R. A fiberscopic and acoustic study of the Korean stops, affricates
and fricatives. Journal of Phonetics, 1974, 2, 161-180.

Kewley-Port, D. Computer processing of EMG signals at Haskins Laboratories.
Haskins Laboratories Status Report on Speech Research, 1973, SR-33,
173-184.

Kewley-Port, D. An experimental evaluation of the EMG data processing system:
Time constant choice for digital integration. Haskins Laboratories
Status Report on Speech Research, 1974, SR-37/38, 65-72.

Kugelberg, E. Properties of the rat hind-limb motor units. In J. E. Desmedt
(Ed.), New developments in electromyography and clinical neurophysiology
(Vol. 1). Basel: S. Karger, 1973, 2-13.

LeFever, R. Statistical analysis of concurrently active human motor units.* 
Unpublished doctoral dissertation, Massachusetts ‘Institute of Technology, 

Leifer ,-/ l. J. Characterization of single muscle fiber discharge during 
voluntary isometric contraction of biceps brachii muscle in man. 
Unpublished doctoral dissertation, Stanford University,' 1 969 • 

Lidddll, E. G. T., & Sherrington, C. S. Further observations on myotatic 
/ reflexes. Proceedings of the Royal Society B, 1925, 97, 267-283. 
Li/idqvist, J*. Laryngeal mechanisms in speech^ Quarterly Progress aricf * Status 
' Report (Speech Transmission - Laboratory, *Royal Institute of Technology, 
Stockholm), 1969,' STL-QPSR 2-3 , £6-32. 





Lippold, O. C. J. The relation between integrated action potentials in a
    human muscle and its isometric tensions. Journal of Physiology, 1952,
    117, 492-499.

MacNeilage, P. F. Preliminaries to a study of single motor unit activity in
    speech musculature. Journal of Phonetics, 1973, 1, 55-71.

MacNeilage, P. F., Sussman, H. M., & Powers, R. K. Activation of motor units
    in speech musculature. Journal of Phonetics, 1977, 5, 135-148.

MacNeilage, P. F., Sussman, H. M., Westbury, J. R., & Powers, R. K.
    Mechanical properties of single motor units in speech musculature.
    Journal of the Acoustical Society of America, 1979, 65, 1047-1052.

Martensson, A., & Skoglund, C. R. Contraction properties of the intrinsic
    laryngeal muscles. Acta Physiologica Scandinavica, 1964, 60, 318-336.

Milner-Brown, H. S., Stein, R. B., & Yemm, R. The contractile properties of
    human motor units during voluntary isometric contractions. Journal of
    Physiology, 1973, 228, 285-306.

Ohala, J. How is pitch lowered? Journal of the Acoustical Society of
    America, 1972, 52, 124. (Abstract)

Painter, C. Implosives, inherent pitch, tonogenesis and laryngeal mechanisms.
    Journal of Phonetics, 1978, 6, 249-274.

Person, R. S., & Kudina, L. P. Discharge frequency and discharge pattern of
    human motor units during voluntary contraction of muscles.
    Electroencephalography and Clinical Neurophysiology, 1972, 32, 471-483.

Sahgal, V., & Hast, M. H. Histochemistry of primate laryngeal muscles. Acta
    Otolaryngologica, 1974, 78, 277-281.

Sawashima, M. Laryngeal research in experimental phonetics. Haskins
    Laboratories Status Report on Speech Research, 1970, SR-23, 69-115.
    [Also in T. Sebeok (Ed.), Current trends in linguistics (Vol. 12). The
    Hague: Mouton, 1976.]

Sawashima, M., Abramson, A. S., Cooper, F. S., & Lisker, L. Observing
    laryngeal adjustments during running speech. Phonetica, 1970, 22,
    193-201.

Shipp, T., Fishman, B. V., & Morrissey, P. Method and control of laryngeal
    EMG electrode placement in man. Journal of the Acoustical Society of
    America, 1970, 48, 429-430.

Shipp, T., & Izdebski, K. Vocal frequency and vertical larynx positioning by
    singers and nonsingers. Journal of the Acoustical Society of America,
    1975, 58, 1104-1106.

Shipp, T., & McGlone, R. E. Laryngeal dynamics associated with voice
    frequency change. Journal of Speech and Hearing Research, 1971, 14,
    761-768.

Sonninen, A. The role of the external laryngeal muscles in length adjustment
    of the vocal cords in singing. Acta Oto-laryngologica, 1956, Suppl. 130.

Tokizane, T., & Shimazu, H. Functional differentiation of human skeletal
    muscle. Springfield, Ill.: Thomas, 1964.

van den Berg, J. Myoelastic-aerodynamic theory of voice production. Journal
    of Speech and Hearing Research, 1958, 1, 227-244.

van den Berg, J. Vocal ligaments versus registers. In Current problems in
    phoniatrics and logopedics. Basel: S. Karger, 1960, 1, 19-34.

van den Berg, J., & Tan, T. S. Results of experiments with human larynxes.
    Practica Oto-Rhino-Laryngologica, 1959, 21, 425-450.

Yemm, R. The orderly recruitment of motor units of the masseter and
    temporalis muscles during voluntary isometric contraction in man.
    Journal of Physiology (London), 1977, 265, 163-174.






Yoshioka, H. Laryngeal adjustments during Japanese fricative and devoiced
    vowel production. Haskins Laboratories Status Report on Speech
    Research, 1979, SR-58, 147-160.

Zuniga, E. N., & Simons, D. G. Nonlinear relationship between averaged
    electromyogram potential and muscle tension in normal subjects. Archives
    of Physical Medicine and Rehabilitation, 1969, 50, 613-620.






INVESTIGATION OF THE PHONATORY MECHANISM* 


Thomas Baer 


Abstract. A rational approach toward the development of improved
techniques for the prevention, detection, diagnosis, and correction
of vocal pathologies rests on an improved understanding of voice
mechanisms. To achieve these goals, we need to better understand
the dimensions of phonatory performance and their dependence both on
the state of laryngeal structures and on patterns of control.
Because of the inaccessible location of the larynx, few direct
measurements of this performance are possible. Quantitative
mathematical modeling is a useful vehicle for studying laryngeal vocal
function. Continuation and extension of excised-larynx and animal
studies can provide detailed data in support of the development and
testing of these models. Human experiments, in vivo, aimed at
factoring out the phonatory consequences of variation in individual
laryngeal control parameters are suggested as a means of further
extending such studies.



INTRODUCTION

A rational approach toward the development of improved techniques for the
prevention, detection, diagnosis, and correction of vocal pathologies rests on
an improved understanding of voice mechanisms. For prevention, we hope to
understand the pattern of control, and its correlates in vibratory
performance, whose breakdown leads to physiological failures in the laryngeal
structures. Our research in detection and diagnosis is directed toward
isolating non-invasive multidimensional measures capable of differentiating
the performance of larynges with different pathologies from the performance of
normal larynges and from each other. In the area of correction, we hope to
improve the conceptual framework for voice training and therapy, and improve
the ability of surgeons to predict the phonatory consequences of alternative
procedures. To achieve these goals, we need to better understand the
dimensions of phonatory performance and their dependence both on the state of
laryngeal structures and on patterns of control.

The process of phonation can be separated into three components: a
phonatory system, its inputs, and its outputs. The system consists of two
subsystems: one aerodynamic (the glottis), and the other mechanical (the
vocal folds). Inputs to this system are muscular adjustments, transglottal
pressure, and some other less significant variables. Outputs may be considered
to be the pattern of mechanical vibrations in the vocal folds, or, more
significantly for voice production, the pattern of airflow into the vocal
tract. This latter output then serves as input to another system, the vocal
tract, whose output is the radiated voice signal.

*A version of this paper was presented at the Conference on Assessment of
Vocal Pathology, Bethesda, Md., April 1979. (Proceedings to be published in
ASHA Reports.)

Acknowledgement. This work was supported by NINCDS Grant NS 13870 and BRS
Grant RR05596.

[HASKINS LABORATORIES: Status Report on Speech Research SR-66 (1981)]






The myoelastic-aerodynamic theory of phonation (van den Berg, 1958)
accounts grossly for the nature of phonation in terms of a passive interaction
between the two phonatory subsystems when an appropriate combination of inputs
is applied. The acoustic theory of speech (Fant, 1960) accounts for the
effects of the vocal tract in transforming the glottal source signal to a
radiated acoustic output signal. Although both of these theories have been
well known for two decades or more, there are significant details that remain
poorly understood. Thus, we have only limited ability to estimate the glottal
volume velocity waveform by canceling the effects of the vocal tract from the
speech output signal, and we have only limited ability to separate the
influences of inputs to the phonatory system from the influences of the system
itself on details of its output. Because of the inaccessible location of the
larynx, few direct measurements of this output are possible.

Investigations into the mechanisms of phonation and its control have
relied heavily on research with models. Much basic knowledge can be derived
from experiments with excised larynges (e.g., van den Berg & Tan, 1959) and
with live animal preparations, which serve as simplified models of their
intact counterparts but which can be more carefully observed and more
systematically controlled. Fabricated mechanical models have also been used
to test hypotheses about the mechanism. For example, Smith (1962)
experimented with a "membrane-cushion" model, which seems to incorporate some
elements of the more recent "cover-body" theory of Hirano (1974, 1975, 1977).
Mostly, however, mathematical descriptions and computer simulations have been
used to formalize and refine knowledge about the mechanisms. Thus, the
development of these models is both a goal and a tool of phonatory research.

The history of these modeling efforts parallels the improvement of our
understanding of the system. As our understanding has become more complete,
the models have become more complex. Building on the aerodynamic studies of
van den Berg, Zantema, and Doornenbal (1957), Flanagan and Landgraf (1968)
modeled the vocal folds as a simple mass-spring system performing horizontal
movements with one degree of freedom. It soon became apparent that an
additional degree of freedom was required to account for vertical phase
differences. Ishizaka and Matsudaira (1972) corrected some errors in van den
Berg's aerodynamic analysis, and showed that a two-mass model of the vocal
folds could more realistically account for the conditions under which
phonation could be initiated. Ishizaka and Flanagan (1972) simulated the
two-mass model, extending the results of Ishizaka and Matsudaira, but were
limited by this model's inability to account realistically for the closed
period of the glottal cycle. Titze (1973, 1974) increased the number of masses
to 16, in order to allow a distribution of vibrations along the
anterior-posterior direction. This model also allowed for some vertical
movements. Finally, Titze and Talkin (1979) have been investigating more
sophisticated models that explicitly model the layered structure of the vocal
folds (Hirano, 1974) and their behavior as a vibrator, and that incorporate
tissue viscosity and bulk incompressibility.
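As a point of reference for the simplest model in this history, the sketch
below computes the undamped natural frequency of a single mass on a linear
spring, f = (1/2π)·sqrt(k/m). It is only an illustration of what a
one-degree-of-freedom mass-spring description implies; the function name and
the mass and stiffness values are assumptions chosen for the example and are
not taken from this paper or from the models it cites.

```python
import math


def natural_frequency_hz(mass_g: float, stiffness_dyn_per_cm: float) -> float:
    """Undamped natural frequency of one mass on a linear spring (CGS units)."""
    if mass_g <= 0 or stiffness_dyn_per_cm <= 0:
        raise ValueError("mass and stiffness must be positive")
    return math.sqrt(stiffness_dyn_per_cm / mass_g) / (2.0 * math.pi)


if __name__ == "__main__":
    # Illustrative (assumed) values: 0.125 g mass, 80,000 dyn/cm stiffness.
    f0 = natural_frequency_hz(mass_g=0.125, stiffness_dyn_per_cm=80_000.0)
    print(f"natural frequency: {f0:.1f} Hz")
```

A single number of this kind says nothing about vertical phase differences or
the closed phase, which is why the later models add degrees of freedom.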




Though it is understood that models must be complex to account
realistically for the phonatory mechanism, there is also a danger inherent in
the growth of complexity. As the number of degrees of freedom and the number
of independent parameters multiply, the possibilities for accurately modeling
the detailed mechanism improve, but so do the possibilities for producing
apparently realistic behavior due to mechanisms that may not represent those
of the real larynx. For our purposes, models must be mechanistically correct
as well as descriptive of the output. It is therefore essential to determine
as many of their parameters as possible and the constraints among them by
direct measurement, and to evaluate the performance of these models in the
greatest possible detail. Furthermore, we ought to be able to make directly
testable predictions on the basis of our modeling efforts.

Further progress in understanding the detailed mechanism of phonation and
in developing an accurate model of it thus depends on detailing the mechanical
characteristics of vocal folds and determining their variation as functions
of laryngeal control. It also depends on improved methods for measuring more
detailed performance characteristics of real larynges, for comparing model
performance to the performance of real larynges, and for generating testable
predictions from modeling studies. Hirano has discussed, both at the
Conference on Assessment of Vocal Pathology and in other publications (Hirano,
1975, 1977), measurements of mechanical properties of the vocal folds and some
patterns of their variation with the contractions of individual muscles.
Other papers at the conference will discuss techniques for obtaining detailed
measurements, and Titze's paper will discuss methods for comparing the
performance of models with these measurements on in vivo larynges. In the
remainder of this paper, the continuation and extension of excised larynx and
animal studies is urged because of their ability to produce detailed data for
the direct testing of models. Then, some experiments in vivo, aimed at
factoring out the phonatory consequences of variations in individual control
parameters, are suggested as a means of further extending these studies.

I. EXPERIMENTS WITH EXCISED LARYNGES AND ANIMALS

It is well known that excised larynges, both canine and human, can
simulate many of the vibratory characteristics of normal human larynges when
they are attached to a pseudosubglottal system that supplies suitably
conditioned airflow and when the positions of the laryngeal cartilages are
suitably controlled, using strings to simulate the functions of muscles. As a
simplified model of their intact counterparts, excised larynges offer several
advantages. Because they are more accessible, they can supply observations
and measurements that cannot be made in vivo. For example, both Matsushita
(1969) and Baer (1975) have developed techniques for observing vibration
patterns both from the normal supraglottal aspect and from the subglottal
aspect. Baer also developed a technique for marking the vocal folds with
small particles and tracking their frontal-plane movement trajectories
throughout a glottal cycle using a microscope and stroboscopic illumination.
Measurements could be made from both the supraglottal and subglottal aspects,
and with the aid of qualitative observations, vocal fold shapes in the frontal
plane throughout a cycle could be reconstructed from the measurements. With
excised larynges, measurements of subglottal pressure and glottal airflow can
be simplified. Furthermore, almost any technique for measuring
characteristics of phonatory vibrations can be used more effectively on an
isolated larynx. Additional advantages are that the configuration of an
excised larynx can be held constant or systematically varied, that its
structures can be experimentally modified to determine the effects on
vibration, and that they are accessible for measurement of mechanical
properties in their configuration for voice production. The major limitations
of the excised preparation (namely, that its death changes some of its
mechanical properties, including its ability to tense the vocalis muscle) can
be overcome by using live animal preparations and stimulating the muscles
electrically. However, these advantages have not been fully exploited.

Figure 1. Schematic diagram of apparatus for measuring vibration patterns of
excised larynges. [Labeled components: stroboscope synched to oscilloscope;
microscope (z motion only); 1-liter reservoir; warm moist air at regulated
flow rate or pressure; microphone or pressure transducer, output to
oscilloscope; subglottal window; needlepoint at center of rotation; rotary
indexing table (x, y, and rotary motion).]


Baer's work with excised larynges was directed toward elucidating the
phonatory mechanism in excised canine larynges. Although there is not space
here to describe these experiments in detail, some of the most significant
results are summarized below.


The experimental apparatus is shown schematically in Figure 1. A larynx
was mounted on a pseudo-trachea, which made a right-angle turn just below the
larynx, allowing a window to obtain a subglottal view. A stroboscope
synchronized to subglottal pressure variations was mounted in front of the
preparation. The phase at which the stroboscope was triggered could be
adjusted to any point within the glottal cycle. Airflow was delivered at
regulated flow rate or pressure, and both average pressure and average flow
rate were measured. The subglottal system was intended to simulate the
acoustic properties of the real subglottal tract. The apparatus was mounted
on the top of a rotary indexing table, whose tabletop could be rotated, so
that observations could be made through the microscope at any angle. The
tabletop could also be translated along its two horizontal axes. A
measurement system was devised by which the locations of any points observed
through the microscope could be determined in three dimensions.


With respect to gross aspects of the performance of excised larynges,
observations already made by others were replicated. In addition, it was
observed that, for a given laryngeal configuration, phonation could be
maintained at values of subglottal pressure below those required for
initiating phonation. As the tissues desiccated, the separation between
conditions for onset and conditions for maintenance increased. Thus, mobility
of the surface tissues appeared to be important for initiating phonatory
vibration. Perhaps this observation has some implications for the assessment
of pathologies.


Figure 2 shows data from a run in which the frontal-plane trajectories of
three particles were measured at eighth-cycle increments while the larynx
sustained steady-state vibration. One particle was on the lateral superior
surface of the vocal folds, a second was near the medial superior surface of
the folds, and a third was on the lower (subglottal) surface. These
trajectories are typical. They were roughly elliptical, in the clockwise
direction (for the coordinate system shown). The minor axis of the ellipses
decreased as average distance from the midline increased. Subglottal
particles moved primarily in a horizontal direction, while supraglottal
particles well off the midline moved primarily in a vertical direction.
Trajectories of particles near the midline often exhibited complex
perturbations near the superior-medial parts of their trajectories.
Trajectories of the two upper particles crossed, so that the particles were
nearly vertically aligned during one measurement and horizontally aligned
during another. Thus, the vibrations were complex. Some aspects of the
trajectories and of vibrations in general were consistent with the notion of a
displacement wave, progressing up the medial surface at a velocity of about
1 m/sec, and then progressing laterally on the superior surface at
.3-.5 m/sec. The supraglottal wave was easily observed, as with normal human
larynges, and its velocity was measured directly. Glottal closure also
exhibited wavelike properties. Tissues at the lower edge of closure were
peeled apart, while tissues above the point of closure were still coming
together. The depth of closure was often almost negligible immediately before
the glottis opened. The middle particle in Figure 2 appeared to be on the
superior part of the vocal folds for part of the cycle, and was below the
point of closure for part of the closed phase. Thus, it is evident that the
vibrations are complex and cannot be well modeled, in detail, as simple
translations of a small number of lumped-parameter masses.

Figure 2. Frontal-plane trajectories of three particles during a single
glottal cycle. Measurements were made at eighth-cycle increments,
numbered 0 through 7. The inset to the right of the trajectories
contains notes about the measurements, including the angle, θ, of
the tabletop for which each measurement was made. The schematic
sketch at the top of the inset indicates the particle locations
with respect to the margin of the vocal fold. [Axis scale:
1 mm/division.]
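To give a feel for what a displacement wave of about 1 m/sec implies, the
sketch below converts a wave speed and a separation between two points on the
tissue surface into a time delay and a phase lag within the glottal cycle. It
is a back-of-envelope illustration only: the function name, the 2 mm
separation, and the 125 Hz fundamental frequency are assumed values for the
example, not measurements reported here.

```python
def phase_lag_cycles(separation_mm: float,
                     wave_speed_m_per_s: float,
                     f0_hz: float) -> float:
    """Phase lag, in fractions of a glottal cycle, between two surface points.

    A wave traveling at `wave_speed_m_per_s` reaches the second point
    `separation_mm` later in space, i.e. delay = distance / speed; the lag in
    cycles is that delay multiplied by the fundamental frequency.
    """
    if wave_speed_m_per_s <= 0 or f0_hz <= 0:
        raise ValueError("wave speed and f0 must be positive")
    delay_s = (separation_mm / 1000.0) / wave_speed_m_per_s
    return delay_s * f0_hz


if __name__ == "__main__":
    # Assumed example: points 2 mm apart on the medial surface, 1 m/sec wave,
    # 125 Hz phonation.
    lag = phase_lag_cycles(separation_mm=2.0, wave_speed_m_per_s=1.0, f0_hz=125.0)
    print(f"lag = {lag:.2f} cycle = {lag * 360.0:.0f} degrees")
```

Under these assumed values the lag is a quarter of a cycle, which is the
order of vertical phase difference that lumped models try to capture with a
second mass.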

Although some aspects of the vibration patterns seemed best describable
by surface waves along the cover of the vocal folds, vibrations of the edge
also appeared to be describable as string vibrations (that is, whole-body
translation and torsional flexure). There may have been components of both
types of vibrations. This interpretation is interesting, because interactions
between the two types of vibration as a function of variations in control
parameters may help to explain fine control over voice quality variations.

Detailed shapes of the vocal folds during the eight phase increments in
Figure 2 were estimated and are shown in Figure 3. A two-mass model
approximation could be superimposed on these shapes if vertical movements of
the masses were allowed. Given this approximation, the aerodynamic theory of
Ishizaka and Matsudaira (1972) was capable of reconciling average subglottal
pressure with average flow rate. It was also shown, as expected, that the
aerodynamic model provided for the efficient transfer of energy from the
aerodynamic system to the mechanical system (Stevens, 1977), given the nature
of vertical phase differences. The mechanical parts of the two-mass model did
not well account for these data, however. Thus, to the extent it could be
tested, the aerodynamic aspect of the two-mass model seemed accurate, but the
mechanical part of the model seemed inadequate.


A change in particle trajectories was observed as the tissues desiccated
and vibrations eventually ceased. These and other measurements suggested that
particle trajectories could be considered as oscillations around an unstable
equilibrium position. This result implies that small-signal modeling
techniques, such as those of Ishizaka and Matsudaira (1972), which account for
voice onset by finding unstable solutions to linear equations, are justified.
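The idea of finding unstable solutions to linear equations can be sketched
with a toy second-order system, m·x'' + b_net·x' + k·x = 0, where b_net is
the tissue damping minus an assumed aerodynamic energy input. The equilibrium
is unstable, and small oscillations grow, when a root of the characteristic
equation has a positive real part. This is a generic stability check under
those assumptions, not the equations of Ishizaka and Matsudaira; all names and
parameter values are hypothetical.

```python
import cmath


def characteristic_roots(m: float, b_net: float, k: float):
    """Roots s of m*s**2 + b_net*s + k = 0 for the linearized oscillator."""
    if m <= 0:
        raise ValueError("mass must be positive")
    disc = cmath.sqrt(complex(b_net * b_net - 4.0 * m * k))
    return ((-b_net + disc) / (2.0 * m), (-b_net - disc) / (2.0 * m))


def is_unstable(m: float, b_net: float, k: float) -> bool:
    """True if any root has a positive real part (oscillations grow)."""
    return any(root.real > 0.0 for root in characteristic_roots(m, b_net, k))


if __name__ == "__main__":
    # Assumed CGS-like values; only the sign of b_net matters for the outcome.
    for b_net in (5.0, -5.0):
        roots = characteristic_roots(0.125, b_net, 80_000.0)
        print(f"b_net={b_net:+.1f}: growth rate {roots[0].real:+.1f} 1/s, "
              f"unstable={is_unstable(0.125, b_net, 80_000.0)}")
```

In this toy picture, onset corresponds to the net damping changing sign; a
model of this kind describes only small oscillations near equilibrium and not
the fully developed cycle.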

Excised larynges were able to produce nearly normal vibrations even when
the vocalis muscle on one or both sides was completely removed. However,
these preparations did not seem capable of falsetto vibrations. Wave motions
with velocity similar to that of the normal case were still seen to propagate
upward on the medial wall. Particle trajectories were somewhat similar to the
normal case, although they differed in some details. These observations
should be especially useful for testing models that account for the layered
structure of the vocal folds.






The experiments described above illustrate the potential value of
developing a model specifically for excised larynges, as a step in developing
a model for the in vivo case. An advantage to modeling the excised preparation
explicitly is not only its versatility, as illustrated by the experiments with
excised vocalis muscles, but also the fact that measurements of mechanical
properties can be made on the same preparation on which the vibration patterns
are measured.

Optical techniques for measuring frontal plane vibration patterns, such
as those used by Baer, are limited because they are time consuming and because
only vibrations of the vocal fold surfaces can be measured. Radiographic
techniques may provide a solution to the problem of measuring vocal fold
shapes throughout a cycle. There have been some radiographic studies of vocal
fold vibrations in vivo. Sovak, Courtois, Haas, and Smith (1971) described a
high-speed radiographic study capable of resolving the details of a glottal
cycle. Hollien, Coleman, and Moore (1968) developed the technique of
stroboscopic laminagraphy, in which an x-ray source is pulsed stroboscopically
during a laminagraphic procedure. For steady phonation, images of a frontal
section could thus be obtained at successive phases within a cycle. The
usefulness of these studies was limited by the poor quality of the images
obtained. Furthermore, they may be no longer practical, in view of modern
concerns about radiographic dosage, especially to the thyroid gland. However,
such techniques could be applied safely and more effectively to the study of
excised or animal larynges. A promising improvement on these techniques was
recently described by Saito (1977) and Saito, Fukuda, Ono, and Isogai (1978).
Small lead pellets were affixed to the vocal fold surfaces and also implanted
within the vocal folds, so that both internal and external vibrations could be
monitored. Stroboscopic radiography, synchronized to the voice, was then used
to track the movements of these particles throughout cycles of vibration.
Such measurements might be made even more effectively with a
computer-controlled x-ray microbeam system (Fujimura, Kiritani, & Ishida,
1973; Kiritani, 1977), if its detector output were stroboscopically sampled or
its source stroboscopically pulsed, because of the improved spatial resolution
of this device. Conceivably, radiopaque medium could be introduced through the
circulatory system, as a further improvement of this technique.



II. MEASUREMENTS IN VIVO: RESPONSES TO INDIVIDUAL CONTROL VARIABLES

There are many parameters controlling phonation in the normal human
larynx. Control is exerted most directly through the effects of the intrinsic
muscles on laryngeal configuration and through transglottal pressure. Forces
exerted by the extrinsic laryngeal muscles and other extrinsic structures also
have an effect. Acoustic load can modify the patterns of airflow through the
glottis and probably the mechanical vibrations as well. There are probably
other effects, such as control of vascular and mucous supply, which are less
well understood. During voluntary control of phonation, variations in several
of these parameters are intercorrelated (see, for example, Atkinson, 1978).
Although such variables as the levels of electromyographic activity in
individual muscles and subglottal pressure can be correlated with
corresponding changes in fundamental frequency or other aspects of phonatory
performance, correlation does not guarantee causality, because of the
intercorrelations among control variables. Therefore, it has been difficult to
isolate the detailed phonatory response to any one of them. Nevertheless,
these detailed effects must be known in order to determine the relevance of
data from excised larynx and animal experiments, to adequately test detailed
phonatory models, and, in general, to fully understand phonatory function.


One method for isolating the effects of a given parameter is to
externally apply involuntary perturbations and observe the phonatory response
while other parameters remain constant. This technique has been most
successfully used for examining the effects of changes in subglottal pressure
on fundamental frequency. Several experiments have been reported in which
subglottal pressure is increased by a sudden push on the chest or abdomen of a
phonating subject, and both subglottal pressure and fundamental frequency are
monitored during an interval for which no muscular response is assumed to
occur (for example, van den Berg, 1957; Isshiki, 1959; Ladefoged, 1963; Ohman
& Lindqvist, 1966; Fromkin & Ohala, 1968). This experiment was recently
replicated by Baer (1979), who also monitored the electromyographic activity
of laryngeal muscles to ensure the absence of a response. Transglottal
pressure can also be varied supraglottally through modulation of intraoral
pressure (Lieberman, Knudson, & Mead, 1969; Hixon, Klatt, & Mead, 1971;
Rothenberg & Mahshie, 1977). When pressure modulations are oscillatory, at
frequencies of about 6-10 Hz, continuous muscular compensation does not seem
to occur, although EMG evidence to support this claim has not been published.



Although results of these induced-pressure-change experiments differ in
some details, their consensus indicates that fundamental frequency varies with
transglottal pressure at rates of about 3-5 Hz/cm H2O within the speech range,
with higher rates at higher fundamental frequencies or in falsetto register.
These results, as well as correlation between fundamental frequency and
subglottal pressure during voluntary control (Atkinson, 1978), suggest that
the phonatory response to pressure change is fast, perhaps within the interval
of one or two glottal periods.
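As a rough worked example of the rate quoted above, the sketch below turns a
change in transglottal pressure into the corresponding range of
fundamental-frequency change, assuming a constant sensitivity of 3-5 Hz per
cm H2O. The function name and the 2 cm H2O step are illustrative assumptions;
the text notes that the rate is higher at high fundamental frequencies and in
falsetto, so a constant rate is only a first approximation.

```python
def f0_shift_range_hz(delta_pressure_cm_h2o: float,
                      rate_low: float = 3.0,
                      rate_high: float = 5.0) -> tuple:
    """Range (min, max) of F0 change in Hz for a transglottal pressure change.

    Assumes F0 varies linearly with pressure at a rate between `rate_low`
    and `rate_high` Hz per cm H2O.
    """
    a = rate_low * delta_pressure_cm_h2o
    b = rate_high * delta_pressure_cm_h2o
    return (min(a, b), max(a, b))


if __name__ == "__main__":
    # Assumed example: a sudden push raises pressure by 2 cm H2O.
    low, high = f0_shift_range_hz(2.0)
    print(f"expected F0 rise: {low:.0f} to {high:.0f} Hz")
```

A step of 2 cm H2O would thus be expected to raise fundamental frequency by
roughly 6 to 10 Hz under this linear assumption.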


The effects of involuntary perturbations in acoustic load on fundamental
frequency have also been investigated through systematic variation in the
length of a tube that artificially extends the vocal tract (Ishizaka,
Matsudaira, & Takashima, 1968; Ishizaka & Flanagan, 1972). Changes in
fundamental frequency of as much as 20 Hz were obtained by varying the length
of the tube. However, it was not determined in these experiments whether
there was any compensatory laryngeal response. It is easily shown that such
artificially increased acoustic loads can have an effect on phonation. If one
phonates an ascending scale into an artificially extended vocal tract (such as
a mailing tube), the voice will typically break or switch to falsetto when the
fundamental frequency nears the first resonance frequency of the tract. A
lower order manifestation of this phenomenon might account for the intrinsic
pitch of vowels (Peterson & Barney, 1952). In any case, such experiments
could be repeated more carefully to further constrain the performance of
phonatory models.






The logical counterpart to these studies for quantifying the effects of
individual muscles on phonatory performance would probably require electrical
stimulation of the muscles. There are no accounts of any such studies on
normal human subjects, and it is unclear whether stimulation experiments are
possible in practice. However, an alternative method, which isolates the
effects of single-motor-unit contractions, has recently been used by Baer
(1978) for investigating the effects of individual muscles on fundamental
frequency. Rather than analyzing gross aspects of fundamental frequency
control, this method relates very small changes in fundamental frequency
(namely, pitch perturbations) to very small changes in muscle tension, which
can be related to single-motor-unit activity. Statistical independence
between motor-unit inputs can then be exploited to uncorrelate the muscles
and examine their individual causal effects on fundamental frequency.


This method extends the use of an averaging technique that was first
developed for studying properties of single motor units in skeletal muscles
(Milner-Brown, Stein, & Yemm, 1973). Single-motor-unit action potentials (see
Harris, 1981) must be identified in an electromyographic recording while the
muscle sustains a contraction. A simplified muscle model, which is
approximately valid at low to moderate levels of contraction, is assumed. This
model is shown in Figure 4. Its inputs are the action potential trains from
individual motoneurons. Each of these can be considered a random point
process, and they are statistically independent across units. Each motor-unit
action potential triggers a mechanical twitch, a positive pulse of tension
whose detailed characteristics vary across motor units. At least some of
these units fire at low enough rates so that adjacent twitches do not overlap.
The output tension of the whole muscle is the summation of its constituent
motor unit outputs. Although many of the motor unit outputs are trains of
pulses, they sum to an approximately constant, though noisy, value because
they are statistically independent. The relative amplitude of this noise
depends on the number of motor units and their firing rates.
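
The dependence of the relative noise on unit count and firing rate can be made
explicit, but only under stronger assumptions than the text states. As a
sketch, suppose (hypothetically) that all N units fire as independent Poisson
processes at a common rate \lambda and share one twitch waveform h(t); the
symbols N, \lambda, h, and T are introduced here for illustration. Campbell's
theorem then gives the mean and variance of the summed tension T(t):

```latex
\begin{aligned}
\bar{T} &= N \lambda \int_0^\infty h(t)\,dt, \\
\sigma_T^2 &= N \lambda \int_0^\infty h(t)^2\,dt, \\
\frac{\sigma_T}{\bar{T}} &= \frac{1}{\sqrt{N\lambda}}\,
  \frac{\sqrt{\int_0^\infty h(t)^2\,dt}}{\int_0^\infty h(t)\,dt}.
\end{aligned}
```

Under these assumptions the relative noise falls as 1/\sqrt{N\lambda}. Real
motor-unit trains are generally more regular than Poisson, and twitch shapes
differ across units, so the expression is at best a rough guide.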






Given the model in Figure 4, the contribution of a single motor unit to
the output tension (its contraction properties) can be estimated if its input
action potentials can be identified and if these inputs are isolated by
intervals great enough to ensure against overlap of adjacent contractions.
Samples of the output tension waveform following the inputs are aligned and
averaged. The output of the isolated motor unit is always the same within
these intervals, while the outputs of all other motor units are random and
thus average to a constant value.
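
The align-and-average step can be illustrated with a small simulation. This is
a minimal sketch, not the procedure used in the studies cited: the twitch
shape, the 1 ms sampling, the Gaussian stand-in for all other motor units, and
every function name below are assumptions made for illustration.

```python
import math
import random


def twitch(t_ms, amplitude=1.0, tau_ms=30.0):
    """Hypothetical twitch: a positive pulse peaking at `amplitude` when t = tau."""
    if t_ms < 0:
        return 0.0
    x = t_ms / tau_ms
    return amplitude * x * math.exp(1.0 - x)


def simulate_tension(firing_times_ms, duration_ms, noise_sd, rng):
    """Whole-muscle tension sampled every 1 ms.

    All other motor units are lumped into a constant level plus Gaussian
    noise (an assumption); the unit under study adds one twitch per firing.
    """
    tension = [10.0 + rng.gauss(0.0, noise_sd) for _ in range(duration_ms)]
    for f in firing_times_ms:
        for dt in range(300):  # the assumed twitch is negligible after 300 ms
            if f + dt < duration_ms:
                tension[f + dt] += twitch(dt)
    return tension


def triggered_average(signal, firing_times_ms, pre_ms=100, post_ms=300):
    """Align a window of `signal` on every firing and average sample by sample."""
    windows = [signal[f - pre_ms:f + post_ms]
               for f in firing_times_ms
               if f - pre_ms >= 0 and f + post_ms <= len(signal)]
    length = pre_ms + post_ms
    return [sum(w[i] for w in windows) / len(windows) for i in range(length)]


def demo(seed=0, n_firings=200):
    """Average 200 simulated firings buried in noise as large as the twitch."""
    rng = random.Random(seed)
    # Firings are at least 400 ms apart, so adjacent twitches do not overlap.
    firings = [200 + 450 * k + rng.randint(0, 50) for k in range(n_firings)]
    tension = simulate_tension(firings, firings[-1] + 400, noise_sd=1.0, rng=rng)
    return triggered_average(tension, firings)
```

In a single window the twitch is no larger than the background noise; after
averaging, the background shrinks roughly with the square root of the number
of firings while the aligned twitch is preserved.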






To apply this technique to investigation of fundamental-frequency control,
we note that motor-unit firings are statistically independent across
muscles as well as within a muscle. We then hypothesize that muscle-tension
variability contributes to the fundamental frequency perturbations that can be
measured when a normal phonating subject attempts to sustain a steady tone.
The resulting model for pitch perturbations is then indicated in Figure 5.
Laryngeal muscles produce roughly constant output tensions that are noisy
because of single-unit effects. The noise components across muscles are
uncorrelated. The complex effect of muscle forces on the vocal folds, which
we have lumped under the term "vocal fold tension," is also roughly constant,
but noisy. Output fundamental frequency then depends on this tension and
other independent inputs such as subglottal pressure and, perhaps, mucosity
and other random effects. All the detailed inputs to this model are thus
statistically independent. According to the model, then, fundamental
frequency as a function of time can be treated as an output and be averaged
just as muscle tension was in earlier studies, to estimate the effects of
single-motor-unit contractions in that muscle. The effects of other muscles
and other inputs average to a constant value.

[Figure 4: block diagram titled "Simplified Muscle Model."]

Figure 4. Simplified model of a muscle during a sustained contraction.







[Figure 5: block diagram titled "Model for Pitch Perturbations."]

Figure 5. Model for pitch perturbations during production of a steady tone.






To obtain data for such a study, a subject is asked to sustain a steady
tone for several breaths. Electromyographic (EMG) activity, obtained through
hooked-wire electrodes from a laryngeal muscle under study, and the voice
signal, obtained through a standard microphone, are recorded and input to a
digital computer. After instantaneous fundamental frequency as a function of
time is derived, this waveform is offset by approximately its average value
and amplified to exaggerate the perturbations. Isolated single-motor-unit
firings are identified in the EMG waveform. Then, samples of the EMG waveform
and the F0 perturbation waveform are aligned around the single firings and
averaged. The sample window extends from 100 ms before to 300 ms after these
firings.
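
The preprocessing steps just described might be sketched as follows. The text
does not say how instantaneous fundamental frequency was derived or how
"isolated" was defined, so both are assumptions here: F0 is taken as the
reciprocal of each glottal period, and a firing counts as isolated when no
other firing of the same unit falls inside its sample window. All function
names are hypothetical.

```python
def instantaneous_f0(cycle_onsets_ms):
    """One F0 estimate (Hz) per glottal cycle, from successive onset times in ms."""
    return [1000.0 / (t1 - t0)
            for t0, t1 in zip(cycle_onsets_ms, cycle_onsets_ms[1:])]


def perturbation(f0_hz, gain=1.0):
    """Offset F0 by its average value and scale by `gain` to exaggerate it."""
    mean = sum(f0_hz) / len(f0_hz)
    return [gain * (f - mean) for f in f0_hz]


def isolated_firings(firing_times_ms, pre_ms=100, post_ms=300):
    """Keep firings whose sample window contains no other firing (assumed rule)."""
    kept = []
    for f in firing_times_ms:
        others = [g for g in firing_times_ms if g != f]
        if all(not (-pre_ms < g - f < post_ms) for g in others):
            kept.append(f)
    return kept
```

For example, `isolated_firings([0, 500, 650, 1500])` drops the firing at 500 ms
because another firing follows it within 300 ms. Before averaging, the
per-cycle F0 values would also have to be placed on a uniform time base, a
step this sketch leaves out.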

Figure 6 shows a 1.5 s sample of data when the muscle under study was the
cricothyroid, whose function as a vocal-fold tenser and hence as a pitch
raiser is well known. Fundamental frequency was about 100 Hz, which is in the
lower part of the subject's range, in order to keep the number of recruited
units and their firing rates low. As this figure shows, fundamental frequency
was estimated to 1 Hz resolution. Although cycle-to-cycle variations rarely
exceed 1 Hz, perturbations over larger time intervals were about 4 Hz wide.
Two firings have been isolated in this record, and the corresponding sample
intervals are indicated by horizontal lines.



Figure 7 shows the results of the averaging calculation for this
experiment after 19 suitable firings were identified. The upper panel shows
the averaged EMG signal, which exhibits a pulse only at the lineup point, as
expected. The lower panel shows the average F0 perturbation. This signal is
approximately at baseline both to the left of the lineup point and to the far
right of the window. However, there is a positive pulse beginning immediately
after the lineup point. This pulse reaches its peak amplitude of 1 Hz at a
latency of about 70-80 ms. The pulse appears to indicate that the
single-motor-unit contraction caused, on the average, a 1 Hz increase in
fundamental frequency.



A similar calculation was performed for one of the strap muscles, an
extrinsic laryngeal muscle whose possible function in lowering F0 has been a
source of some controversy. When fundamental frequency was in the middle of
the subject's range, no systematic effect was found. Results when the
fundamental frequency was low are shown in Figure 8. Although these data are
somewhat noisier than those in Figure 7, they appear to exhibit a negative
pulse in the interval immediately after the lineup point. Thus, the strap
muscle is shown to have a causal effect in lowering fundamental frequency from
an already low level.



The demonstration of a muscular contribution to F0 perturbations is itself
interesting, since perturbations have been used as an indicator of vocal
pathology. These results show that care must be taken when interpreting
patterns of perturbation. More relevant to this discussion, however, is the
fact that we can show the response to a short-duration pulse of tension in a
single muscle, and that these data can thus be used to constrain the
performance of laryngeal models. It was noted that the average pitch
perturbation for the cricothyroid muscle begins immediately after the lineup
point. This shows that the phonatory response must begin within one glottal
cycle. The latency of the peak of the response, 70-80 ms, includes
contributions due to muscle contraction time, mechanical response latency in
the larynx, and latency of phonatory response. Since both the latency and the
amplitude of the mechanical motor-unit contractions can be estimated in animal
experiments, these data might be further applied to the detailed testing of
models of laryngeal performance, especially in comparison with data reported
by Hirano (1975) relating changes in shape and mechanical properties of vocal
folds to stimulation of various muscles. These data might also shed some
further light on the pattern of motor control. For example, the relatively
large amplitude of the F0 perturbation pulse in Figure 7 relative to the
overall perturbation in Figure 6 suggests that very few motor units were
firing at rates low enough to show the effects of individual twitches.
However, it is unclear how many other units may have been in tetanus. Perhaps
the greatest value of the single-unit technique will be in elucidating the
phonatory function of muscles such as the vocalis, whose gross patterns of
activity are so intercorrelated with those of other muscles during ongoing
regulation of phonation that their detailed effects have remained obscure.

Figure 6. Short segment of data during production of a steady tone at about
100 Hz. Top: voice waveform; Middle: EMG activity of the
cricothyroid muscle; Bottom: "instantaneous fundamental frequency"
extracted from the voice waveform. Two sets of horizontal lines
indicate intervals from 100 ms before to 300 ms after single-motor-
unit firings in the cricothyroid muscle.

[Figure 7 panel labels: Cricothyroid muscle; N = 18; F0 = 166 Hz; raw EMG
aligned at single firings and averaged.]

Figure 7. Ensemble-average waveforms of EMG activity from the cricothyroid
muscle and corresponding instantaneous fundamental frequency. All
waveforms have been aligned at the time of a single-motor-unit
firing for purposes of averaging.

[Figure 8 panel labels: Strap muscle; N = 25; F0 = 88 Hz; raw EMG aligned at
single firings and averaged; pitch perturbations (around F0) aligned as above
and averaged. Time axis: -100 msec, lineup point, +300 msec.]

Figure 8. Ensemble-average waveforms of EMG activity from an unspecified
strap muscle and corresponding instantaneous fundamental frequency.
All waveforms have been aligned at the time of a single-motor-unit
firing for purposes of averaging.



In considering the function of individual control parameters in this
section, we have only discussed measurements of their effects on fundamental
frequency. The reason for this is that, with few exceptions, these are the
only measurements that have been made. Fundamental frequency by itself,
however, is evidently not a very complete descriptor of phonatory activity.
As fundamental frequency is varied, attributes of the vocal source waveform
that contribute to intensity and voice quality also vary. It is important to
determine how these parameters covary when changes are produced by different
control mechanisms, and, for purposes of assessing vocal pathology, how these
relationships change in different pathological states.



Techniques to be discussed in today's session can be used to measure some
of these different parameters of phonatory performance, such as amplitude of
the glottal pulse and open quotient. When these parameters are measured
cycle-to-cycle, the same techniques described in the section for studying
fundamental frequency control can be utilized to assess the effects of
different control parameters. These data, together with such anatomical and
physical studies as those reported by Hirano (1975), are needed to improve our
understanding of the phonatory mechanism and constrain the performance of
mechanistic models. Thus, these studies should be pursued. Furthermore, if
it were possible, it would be even more useful to study not only changes in
vibratory performance characteristics as a function of these control
parameters, but also intermediate variables such as the positions of the
laryngeal structures and their mechanical properties. However, these
experiments must await the development of techniques for measuring these
parameters.



Finally, further insights are needed into the detailed conditions
necessary for initiating and sustaining phonation, as well as for regulating
ongoing phonation. An example of how such studies might be performed in vivo
is by using involuntary perturbations of subglottal pressure. For example, a
subject might be asked to assume a configuration appropriate for voicing but
to maintain subglottal pressure at a level below the threshold for voice
onset. Transglottal pressure might then be suddenly increased, say using a
chest push procedure, to a level for which phonatory vibrations are initiated,
while laryngeal configuration remains constant. Conditions for voice onset
could then be determined, in terms of the level of subglottal pressure as a
function of variations in the configuration. With negative transglottal
pressure perturbations, conditions for voice offset could also be studied.



REFERENCES

Atkinson, J. E. Correlation analysis of the physiological factors controlling
fundamental voice frequency. Journal of the Acoustical Society of
America, 1978, 63, 211-222.

Baer, T. Investigation of phonation using excised larynxes. Unpublished
doctoral dissertation, Massachusetts Institute of Technology, 1975.

Baer, T. Effect of single-motor-unit firings on fundamental frequency of
phonation. Journal of the Acoustical Society of America, 1978, 64, S90.
(Abstract)

Baer, T. Reflex activation of laryngeal muscles by sudden induced subglottal
pressure changes. Journal of the Acoustical Society of America, 1979,
65, 1271-1275.

Fant, G. Acoustic theory of speech production. 's-Gravenhage: Mouton, 1960.

Flanagan, J. L., & Landgraf, L. Self-oscillatory source for vocal-tract
synthesizers. IEEE Transactions on Audio and Electroacoustics, 1968,
AU-16, 57-64.

Fromkin, V., & Ohala, J. Laryngeal control and a model of speech production.
UCLA Working Papers in Phonetics, 1968, 10, 98-110.

Fujimura, O., Kiritani, S., & Ishida, H. Computer controlled radiography for
observation of movements of articulatory and other human organs.
Computers in Biology and Medicine, 1973, 3, 371-384.

Harris, K. S. Electromyography as a technique for laryngeal investigation.
Haskins Laboratories Status Report on Speech Research, 1981, SR-66, this
volume.

Hirano, M. Morphological structure of the vocal cord as a vibrator and its
variations. Folia Phoniatrica, 1974, 26, 89-94.

Hirano, M. Phonosurgery: Basic and clinical investigations. Otologia
Fukuoka, 1975, 21, 239-440.

Hirano, M. Structure and vibratory behavior of the vocal folds. In
M. Sawashima & F. S. Cooper (Eds.), Dynamic aspects of speech production.
Tokyo: University of Tokyo Press, 1977, 13-27.

Hixon, T. J., Klatt, D. H., & Mead, J. Influence of forced transglottal
pressure changes on vocal fundamental frequency. Journal of the
Acoustical Society of America, 1971, 49, 105. (Abstract)

Hollien, H., Coleman, R., & Moore, P. Stroboscopic laminagraphy of the larynx
during phonation. Acta Oto-laryngologica, 1968, 65, 209-215.

Ishizaka, K., & Flanagan, J. L. Synthesis of voiced sound from a two-mass
model of the vocal cords. Bell System Technical Journal, 1972, 51,
1233-1268.

Ishizaka, K., & Matsudaira, M. Fluid mechanical considerations of vocal cord
vibration. SCRL Monograph (Speech Communication Research Laboratory,
Santa Barbara), 1972, No. 8.

Ishizaka, K., Matsudaira, M., & Takashima, M. Variations of the vocal pitch
with the driving point impedance. Journal of the Acoustical Society of
Japan, 1968, 24, 313-314.

Isshiki, N. Regulatory mechanism of the pitch and volume of voice.
Oto-Rhino-Laryngology (Kyoto), 1959, 52, 1065-1094.

Kiritani, S. Articulatory studies by the x-ray microbeam system. In
M. Sawashima & F. S. Cooper (Eds.), Dynamic aspects of speech production.
Tokyo: University of Tokyo Press, 1977, 171-190.

Ladefoged, P. Some physiological parameters in speech. Language and Speech,
1963, 6, 109-119.

Lieberman, P., Knudson, R., & Mead, J. Determination of the rate of change of
fundamental frequency with subglottal air pressure during sustained
phonation. Journal of the Acoustical Society of America, 1969, 45,
1537-1543.

Matsushita, H. Vocal cord vibration of excised larynges: A study with
ultra-high speed cinematography. Otologia Fukuoka, 1969, 15, 127-142.

Milner-Brown, H. S., Stein, R. B., & Yemm, R. The contractile properties of
human motor units during voluntary isometric contractions. Journal of
Physiology, 1973, 228, 285-306.

Ohman, S., & Lindqvist, J. Analysis-by-synthesis of prosodic pitch contours.
Quarterly Progress and Status Report (Speech Transmission Laboratory,
Royal Institute of Technology, Stockholm), 1966, STL-QPSR 1/1966, 1-6.

Peterson, G. E., & Barney, H. L. Control methods used in a study of the
vowels. Journal of the Acoustical Society of America, 1952, 24, 175-184.

Rothenberg, M., & Mahshie, J. Induced transglottal pressure variations during
voicing. Journal of the Acoustical Society of America, 1977, 62, S14.
(Abstract)

Saito, S. Phonosurgery. Otologia Fukuoka, 1977, 23, 171-384.

Saito, S., Fukuda, H., Ono, H., & Isogai, Y. Observation of vocal fold
vibration by x-ray stroboscopy. Journal of the Acoustical Society of
America, 1978, 64, S90-91. (Abstract)

Smith, S. On artificial voice production. In A. Sovijarvi & P. Aalto (Eds.),
Proceedings of the 4th International Congress of Phonetic Sciences. The
Hague: Mouton, 1962, 96-110.

Sovak, M., Courtois, J., Haas, C., & Smith, S. Observations on the mechanism
of phonation investigated by ultraspeed cinefluorography. Folia
Phoniatrica, 1971, 23, 277-287.

Stevens, K. N. Physics of laryngeal behavior and larynx modes. Phonetica,
1977, 34, 264-279.

Titze, I. R. The human vocal cords: A mathematical model. Part 1.
Phonetica, 1973, 28, 129-170.

Titze, I. R. The human vocal cords: A mathematical model. Part 2.
Phonetica, 1974, 29, 1-21.

Titze, I. R., & Talkin, D. A theoretical study of the effects of various
laryngeal configurations on the acoustics of phonation. Journal of the
Acoustical Society of America, 1979, 66, 60-74.

van den Berg, Jw. Sub-glottal pressure and vibrations of the vocal folds.
Folia Phoniatrica, 1957, 9, 65-71.

van den Berg, Jw. Myoelastic-aerodynamic theory of voice production. Journal
of Speech and Hearing Research, 1958, 1, 227-244.

van den Berg, Jw., & Tan, T. S. Results of experiments with human larynges.
Practica Oto-Rhino-Laryngology, 1959, 21, 425-450.

van den Berg, Jw., Zantema, J. T., & Doornenbal, P. On the air resistance and
the Bernoulli effect of the human larynx. Journal of the Acoustical
Society of America, 1957, 29, 626-631.






PHONETIC PERCEPTION OF SINUSOIDAL SIGNALS: EFFECTS OF AMPLITUDE VARIATION* 



Robert E. Remez,+ Philip E. Rubin, and Thomas D. Carrell++ 






Abstract. Naive subjects, when instructed to listen for a sentence,
are capable of transcribing the phonetic message of acoustic signals
consisting solely of time-varying sinusoids. These unnatural-sounding
signals mimic the pattern of formant center-frequency and
amplitude variation over the course of polysyllabic, semantically
normal utterances. To what extent does amplitude variation over
time contribute to intelligibility? Our present investigation
tested the hypothesis that listeners derive some information about
syllable patterns from amplitude variation alone, and may therefore
use contextual constraints to deduce prosodically appropriate
portions of the message in the tonal stimulus. Phonetic and
syllabic intelligibility were compared in four conditions: (1)
normal amplitude and frequency variation; (2) normal frequency
variation with constant amplitude; (3) normal frequency variation
with a misleading amplitude contour; and (4) normal amplitude
variation with no frequency variation. These results are discussed
in the framework of phonetic perception and in terms of current
theories of the perception of fluent speech.



Talkers make sounds for listeners to hear. This truism has implicitly
motivated many present explanations of speech perception. Essentially, these
explanations have sought to enumerate the perceptually critical acoustic
elements produced by talkers when generating phonetic sequences. Researchers
have used the ability to synthesize speech to fashion acoustic signals
containing only those acoustic components of natural utterances believed to be
necessary for perception. In doing so, we have made highly refined and
specific descriptions of the stimuli that elicit phonetic perception. In
complementary research, studies of the auditory periphery, of the basilar
membrane, cochlear nucleus and auditory projection have permitted us to learn
how the critical acoustic elements survive auditory transmission. But,

*Paper presented at the 101st Meeting of the Acoustical Society of America,
Ottawa, Ontario, Canada, May 22, 1981.

+Department of Psychology, Barnard College, Columbia University, New York,
New York.

++Department of Psychology, Indiana University, Bloomington, Indiana.

Acknowledgement. For helping us conceptually, we thank Franklin Cooper,
Alvin Liberman, David Pisoni, Brad Rakerd, and Michael Studdert-Kennedy.
This research is supported by a grant from Sigma Xi to Robert E. Remez,
Grant HD 01994 from the National Institute of Child Health and Human
Development to Haskins Laboratories, and Grant MH 24027 from the National
Institute of Mental Health to David B. Pisoni.

[HASKINS LABORATORIES: Status Report on Speech Research SR-66 (1981)]




regardless of the differences among the many approaches to studying phonetic
perception, all approaches have assumed that the stimuli for phonetic
perception consist necessarily of the kinds of sounds produced by a variably
excitable, variably shapable tube-resonator: the vocal tract.¹

A recent demonstration of ours questioned the assumption that the
perceiver requires phonetic stimuli to comprise, however selectively, acoustic
elements found in natural utterances (Remez, Rubin, Pisoni, & Carrell, 1981).
In raising this question, our study also challenged the assumption that
phonetic perception is based simply on a succession of discrete acoustic
elements. In this study, we used a signal consisting of three time-varying
sinusoids, each of which varied in a way that a formant peak might vary over
the course of an utterance. Initially we fabricated the sinusoidal pattern by
computing the resonant center-frequencies of a natural utterance, using Linear
Predictive Coding (see Figure 1). The table of values produced through this
analysis was used to set frequency and amplitude parameters of a sine-wave
synthesizer. Figure 2 shows the differing short-time Fourier spectra of
natural, synthetic (OVE and Haskins Pattern Playback), and sine-wave signals.
Note the absence of a fundamental frequency, harmonic spectrum, and broadband
formants in the sinewave signal. Lacking these acoustic attributes, the
sinewave spectrum does not resemble the spectrum of a natural signal in any
literal sense. However, there is energy, albeit infinitely narrowband, at the
computed peaks throughout the duration of the pattern; and the time-varying
properties of the sinewave pattern, specifically the coherence of the changes
of the energy peaks over time, replicate the natural case.
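
The synthesis step can be sketched in a few lines. This is an illustration
under stated assumptions, not a description of the synthesizer actually used:
the 10 ms frame, the 10 kHz sample rate, the stepwise (uninterpolated)
parameter updates, and the function name are all hypothetical. Each tone
keeps a running phase so that frequency changes do not introduce clicks.

```python
import math


def synthesize_sinewaves(freq_tracks, amp_tracks, frame_ms=10.0, sample_rate=10000):
    """Sum several time-varying sinusoids.

    freq_tracks[k][i] is the frequency in Hz of tone k during frame i, and
    amp_tracks[k][i] is its linear amplitude. Parameters are held constant
    within a frame (a simplification); phase is accumulated per tone.
    """
    n_tones = len(freq_tracks)
    n_frames = len(freq_tracks[0])
    samples_per_frame = int(sample_rate * frame_ms / 1000.0)
    phases = [0.0] * n_tones
    output = []
    for frame in range(n_frames):
        for _ in range(samples_per_frame):
            sample = 0.0
            for k in range(n_tones):
                phases[k] += 2.0 * math.pi * freq_tracks[k][frame] / sample_rate
                sample += amp_tracks[k][frame] * math.sin(phases[k])
            output.append(sample)
    return output
```

In use, the three frequency tracks would hold the hand-corrected formant
center frequencies from the analysis and the three amplitude tracks the
corresponding formant amplitudes, one value per analysis frame.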

The perceptual effects of sinewave stimuli were easy to predict. Because
the short-time spectra of three-tone signals differ drastically from natural
and even synthetic speech; because no talker is capable of producing three
simultaneous "whistles" with these bandwidths, in this frequency range; and
because the frequency and amplitude variation of the three tones is not
synchronized, the perceiver should hear three independent streams, one for
each sinusoid. The perceiver should hear no phonetic qualities.

However straightforward this prediction seems, there was a second,
contrasting prediction. Suppose that the listener is able to disregard the
short-time differences between sinusoidal signals and speech, and can attend,
instead, to the overall pattern of change of the three tones. The pattern of
change of the frequency peaks resembles the resonance changes produced by a
vocal tract articulating speech. If the listener can apprehend this coherence
in the time-varying properties of the nonspeech signal, then he should hear a
phonetic message spoken by an impossible voice.

Given nonspeech stimuli whose time-varying properties are abstractly
vocal, listeners perceived the signals in both of the ways we predicted.
Those listeners who were told nothing about the stimuli heard science fiction
sounds, bad electronic music, sirens, computer bleeps and radio interference.²
Those listeners who instead were instructed to transcribe a "strangely
synthesized English sentence" did exactly that, for the most part: they
identified the radically unnatural "voice" quality of the patterns, but they
transcribed those patterns as they would have the original natural utterances
upon which we based our sinewave stimuli.




[Figure 1: flowchart titled "Sinewave Synthesis Simulation of a Naturally
Produced Utterance." Stages, in order: naturally produced utterance;
digitization; LPC analysis with peak-picking; formant center frequencies;
conversion to sinewave synthesis input values; hand correction of frequency
values; sinewave synthesis; digitized waveform; conversion to audio.]

Figure 1. Sinewave stimuli are produced by imitating the time-varying
properties of the center frequency and amplitude of the first three
formants in a natural utterance.




[Figure 2: four panels of Fourier spectra plotted against frequency, labeled
Natural, OVE, Playback, and Sinewave.]

Figure 2. A comparison of the Fourier spectra of four complex waveforms:
(A) natural speech; (B) synthetic speech produced by the OVE
synthesizer; (C) synthetic speech produced by the Haskins Labs
Pattern Playback; (D) waveform consisting of three sinusoids.






This finding was novel in at least two ways. (1) It extended research on
phonetic perception of sinusoidal signals to a high-uncertainty judgment task,
by offering unrestricted response alternatives. Previous tests of sinusoidal
patterns had used forced-choice identification tasks with small response sets
(Bailey, Summerfield, & Dorman, 1977; Best, Morrongiello, & Robson, 1981;
Cutting, 1974; Fant, 1959; Grunke & Pisoni, 1979). Subjects' performance is
obviously stabilized in such circumstances. However, we showed that the
intelligibility of sinusoids does not depend on extensive training with
simple, schematic stimuli, nor on test procedures that intrinsically promote
consistent performance.

(2) More generally, the study indicated that speech perception is
possible despite drastic departures from the short-time spectra of natural
speech (despite absence of broadband formants, harmonic spectrum, and
fundamental frequency) insofar as the time-varying properties of speech
signals are preserved, and insofar as the listener is able to attend to the
coherent time-variation of the acoustic pattern. Both of these general
qualifications must obtain for phonetic perception of sinusoids to occur, for
the listeners who were not directed to expect speech for the most part did not
spontaneously hear phonetic sequences in the tones.



The present investigation is directed toward questions that arose from
our initial research with perception of sinusoidal replicas of fluent,
semantically ordinary utterances. Primarily, we noted that the tonal patterns
could well be considered an extreme case of defective acoustic-phonetic
stimuli. If this description were apt, then the perceptual process could be
described more conventionally, in quite different terms. Listeners might
merely have memorized the tune of the tones without any phonetic recognition;
and, after inferring a prosodic schema from the amplitude contour preserved in
the tonal pattern, listeners would then have been free to guess (or, rather,
to hypothesize) a likely phonetic sequence for the utterance using "top-down"
finesse. A number of views of the perception of fluent speech include a
prominent faculty for best-guessing lexical patterns from the prosodic
structure when the phonetic stimulus is defective or ambiguous (e.g., Cutler &
Foss, 1977; Huggins, 1978; Nakatani & Schaffer, 1978). Perhaps the listeners
in our original study relied on such guesswork for transcribing the stimulus,
and did not immediately perceive the message from phonetic structure preserved
in the time-varying tonal pattern. In that case, very little phonetic
perception would have occurred, and our theoretical claim would need to be
moderated.



In the test we report here, each listener was presented with a sinusoidal
pattern replicating the sentence, "Where were you a year ago?" In response,
the listener reported two things: (1) a transcription of the sentence; and
(2) a count of the syllables in the sentence. If phonetic information is
preserved in the coherence of the changing sinusoids, then transcription
performance should be no poorer than syllable counting, which would presumably
be based here on the linguistic structure of the message. If, on the
contrary, only prosodic information in the form of amplitude variation is
readily available to the listener, then syllable counting should be much more
accurate than transcription of the message. In this latter condition,
subjects would be likely to vary in the particular phonetic guesses they make
given that an infinity of sentences may conform to the same prosodic pattern.






The present test also included a stimulus manipulation to evaluate more
directly the difference between perceiving the phonetic structure and guessing
about it based on amplitude information about prosody. Four conditions were
used. In the first, listeners gave their two responses to a sinusoidal
pattern that preserved both peak-frequency and peak-amplitude change of the
first three formants of the original, natural utterance (see Figure 3). In
the second condition, listeners heard a pattern that preserved the frequency
variation of the first three formant center-frequencies at a constant level of
energy throughout the utterance (see Figure 4). In the third condition, the
sinusoidal pattern preserved the frequency pattern of the first three
formants, but with a grossly misleading amplitude contour containing four
segments of high energy and five segments of low energy, high and low
differing by approximately 20 dB (see Figure 5). The fourth condition employed
a sinusoidal pattern with the original formant amplitude variation but with no
frequency variation (see Figure 6). If the coarse amplitude structure of the
stimuli provides reliable prosodic structure, and if subjects rely on this
source of information about the message, then syllable counting should be
accurate in conditions 1 and 4, and poorer in conditions 2 and 3. In
addition, the accuracy of transcription should follow the accuracy of
counting. If subjects perceive the phonetic sequence based on the time-varying
properties of frequency variation, however, transcription and counting should
be good in all conditions but the fourth, in which there is no frequency
variation.
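The four stimulus conditions can be illustrated with a minimal sketch of
three-tone synthesis. This is not the synthesis procedure used in the study,
which is not described in this section. The sample rate, the frame duration,
the function names, the strict alternation of low and high segments in
condition 3, and the use of each track's mean frequency in condition 4 are all
assumptions made for illustration. Only the 20 dB difference and the count of
four high and five low segments come from the text.

```python
import math

SAMPLE_RATE = 10000  # Hz; assumed value, not taken from the report


def synthesize(freq_tracks, amp_tracks, frame_dur=0.01, sample_rate=SAMPLE_RATE):
    """Sum one sinusoid per track; each track gives one value per frame."""
    n_frames = len(freq_tracks[0])
    samples_per_frame = round(frame_dur * sample_rate)
    phases = [0.0] * len(freq_tracks)  # running phase keeps each tone continuous
    out = []
    for frame in range(n_frames):
        for _ in range(samples_per_frame):
            sample = 0.0
            for k in range(len(freq_tracks)):
                phases[k] += 2 * math.pi * freq_tracks[k][frame] / sample_rate
                sample += amp_tracks[k][frame] * math.sin(phases[k])
            out.append(sample)
    return out


def make_condition(cond, freqs, amps):
    """Return (freq_tracks, amp_tracks) for stimulus condition 1, 2, 3, or 4."""
    n = len(freqs[0])
    if cond == 1:  # original frequency and amplitude variation
        return freqs, amps
    if cond == 2:  # frequency variation at a constant energy level
        return freqs, [[1.0] * n for _ in freqs]
    if cond == 3:  # misleading contour: nine segments, 5 low and 4 high
        low = 10 ** (-20 / 20)  # 20 dB below the high level
        # Assumption: segments are equal in length and alternate low/high.
        contour = [1.0 if (i * 9 // n) % 2 == 1 else low for i in range(n)]
        return freqs, [contour[:] for _ in freqs]
    if cond == 4:  # original amplitude variation, no frequency variation
        # Assumption: each tone is fixed at the mean of its original track.
        return [[sum(track) / n] * n for track in freqs], amps
    raise ValueError("cond must be 1, 2, 3, or 4")
```

Calling `synthesize(*make_condition(3, freqs, amps))` with three hypothetical
formant tracks of at least nine frames each would yield a waveform whose
frequency pattern is intact but whose energy contour no longer follows the
syllables.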






Our results are straightforward, as Figure 7 depicts. Transcription was
good in conditions 1 (n=14), 2 (n=13) and 3 (n=12); there was no statistical
effect of the amplitude manipulation in these conditions. This indicates that
subjects were not hindered by defective coarse acoustic structure when fine
acoustic structure was available for phonetic perception. (Condition 4 was
not scored for transcription, for the obvious reason that there was nothing
phonetic to transcribe.) In the syllable counting task, there was an enormous
difference between condition 4 (no frequency variation, appropriate amplitude
variation) and the other three conditions (appropriate frequency variation
with either normal, flat, or misleading amplitude variation). A post hoc
means test confirmed that this effect is highly significant (Scheffé, p < .001).
Subjects were clearly unable to derive syllable information solely from
amplitude variation in this case (cf. O'Malley & Peterson, 1966).






We conclude from these results that sinusoidal signals do not consist of
veridical prosodic information and defective acoustic-phonetic information.
Listeners lacked the ability to follow the syllable structure when only the
amplitude variation of the original transcribable pattern was preserved, yet
they were able to apprehend the phonetic detail even when the energy contour
was grossly inappropriate to the segments within it. It seems that listeners
who transcribed these sinusoidal replicas of speech must have relied on
information about the phonetic sequence available in the frequency variation
alone.



Overall, these studies of sinusoidal signals contribute new knowledge
about phonetic perception that is perhaps counterintuitive. That is, phonetic
perception can be elicited solely by a coherent pattern of acoustic variation
comprising elements that cannot, in principle, be realized vocally. In order
to detect this coherence despite unproducible short-time spectra, listeners
must ultimately rely on even more abstract and more forgiving knowledge of
vocal tracts than has been proposed by Liberman (1979). We venture to say
that phonetic perception may actually be based on attention to the coherent
patterns of change in acoustic energy rather than on attention to the
particular qualities of the successive, discrete acoustic elements that
compose the speech signal. To refine our speculation, we must extend this
technique to a wider phonetic repertoire; to a more varied test of short-time
spectral properties that permit the effect to occur; and to manipulations of
the coherence of change directly.

Figure 3. Display of waveform, energy and frequency change of three-tone
replica of "Where were you a year ago?" Stimulus condition 1 (normal
amplitude).

Figure 4. Stimulus condition 2 (flat amplitude): variation in the frequency
of the three tones at a constant energy level.

Figure 5. Stimulus condition 3 (misleading amplitude): variation in the
frequency of the three tones with a prosodically misleading amplitude pattern.

Figure 6. Stimulus condition 4 (no frequency variation, normal amplitude): no
frequency variation with the prosodically appropriate amplitude pattern.

Figure 7. Top: group averages of transcription performance. Bottom: group
averages of syllable counting. The top panel shows the normal, flat, and
misleading amplitude conditions with frequency variation; the bottom panel
shows the same three conditions and the normal amplitude condition with no
frequency variation.

REFERENCES

Bailey, P. J., Summerfield, A. Q., & Dorman, M. On the identification of
   sine-wave analogues of certain speech sounds. Haskins Laboratories
   Status Report on Speech Research, 1977, SR-51/52, 1-25.

Best, C. T., Morrongiello, B., & Robson, R. Perceptual equivalence of
   acoustic cues in speech and nonspeech perception. Perception &
   Psychophysics, 1981, 29, 191-211.

Cutler, A., & Foss, D. J. On the role of sentence stress in sentence
   processing. Language and Speech, 1977, 20, 1-10.

Cutting, J. E. Two left-hemisphere mechanisms in speech perception.
   Perception & Psychophysics, 1974, 16, 601-612.

Fant, G. Acoustic analysis and synthesis of speech with applications to
   Swedish. Ericsson Technics, 1959, 15, 3-108.

Grunke, M. E., & Pisoni, D. B. Perceptual learning of mirror-image acoustic
   patterns. In E. Fischer-Jørgensen, J. Rischel, & N. Thorsen (Eds.),
   Proceedings of the Ninth International Congress of Phonetic Sciences
   (Vol. 2). Copenhagen: Institute of Phonetics, 1979, 461-467.

Huggins, A. W. F. Speech timing and intelligibility. In J. Requin (Ed.),
   Attention and performance VII. Hillsdale, N.J.: Lawrence Erlbaum
   Associates, 1978, 279-298.

Liberman, A. M. How abstract must a motor theory of speech perception be?
   Revue de Phonétique Appliquée, 1979, 49/50, 41-58.

Nakatani, L. H., & Schaffer, J. A. Hearing "words" without words: Prosodic
   cues for word perception. Journal of the Acoustical Society of America,
   1978, 63, 234-245.

O'Malley, M. H., & Peterson, G. E. An experimental method for prosodic
   analysis. Phonetica, 1966, 15, 1-13.

Remez, R. E., Rubin, P. E., Pisoni, D. B., & Carrell, T. D. Speech perception
   without traditional speech cues. Science, 1981, 212, 947-950.






FOOTNOTES 



1. To our knowledge, no one claims that the properties of a talker's
utterances necessary to perception are supplied in the auditory channel,
though such a view cannot be excluded a priori.

2. A very small number of listeners did recognize some phonetic properties
of the stimuli.










MEMORY FOR ITEM ORDER AND PHONETIC RECODING IN THE BEGINNING READER*

Robert B. Katz,+ Donald Shankweiler,+ and Isabelle Y. Liberman+

Abstract. A defect in immediate memory for item order is often
attributed to poor beginning readers. We have supposed that this
problem may be a manifestation of an underlying deficiency in the
use of phonetic codes. Accordingly, we expected good and poor
readers to differ in their ability to order stimuli that can be
easily recoded as words and stored in phonetic form, but not in
their ability to order nonlinguistic stimuli that do not lend
themselves to phonetic recoding in short-term memory. The purpose
of the present study was to test this hypothesis by examining the
ability of good and poor readers to reconstruct the order of sets of
briefly presented stimuli that varied in the extent to which they
could be distinctively recoded into phonetic form: pictures of
common objects versus nonrepresentational, "doodle" drawings. As
expected, an interaction between reading ability and type of
stimulus item was found, demonstrating the material-specific nature of
poor readers' ordering difficulties. These findings support the
hypothesis that a function of the phonetic representation is to aid
in retention of order information and that poor readers' ordering
difficulties are related to their deficient use of phonetic codes.

Certain commonly occurring memory problems of poor beginning readers have
been regarded as manifestations of an underlying deficiency in the use of
phonetic codes. Several studies have shown that children who are poor readers
tend to make ineffective use of phonetic coding in short-term recall of
linguistic material (Liberman, Shankweiler, Liberman, Fowler, & Fischer, 1977;
Mann, Liberman, & Shankweiler, 1980; Shankweiler, Liberman, Mark, Fowler, &
Fischer, 1979). However, special difficulties with recall and recognition
arise only when the stimulus items are words or other items that can readily
be labeled linguistically and retained phonetically in working memory (Holmes
& McKeever, 1979; Vellutino, Pruzek, Steger, & Meshoulam, 1973; Vellutino,
Steger, & Kandel, 1972). When the stimuli do not lend themselves to phonetic
coding, the performances of good and poor readers cannot be distinguished.
For example, we (Liberman, Mann, Shankweiler, & Werfelman, Note 1) tested
recognition memory with two sets of stimuli that could not be easily labeled:
unfamiliar faces and abstract, nonrepresentational line drawings (Kimura,
1963). It was found that good and poor readers were indistinguishable on
memory for both faces and nonsense drawings.

*To appear in Journal of Experimental Child Psychology.

+Also University of Connecticut.

Acknowledgment. This investigation was supported by NICHD Grant HD-01994 and
BRS Grant RR-05596 to Haskins Laboratories. We are grateful to the principal
and teachers of the Parker Memorial School, Tolland, Connecticut for
allowing us to work with the second-grade classes. We are also grateful to
the children who participated and their parents for their cooperation.
Special thanks are due to Leonard Katz for statistical advice.

[HASKINS LABORATORIES: Status Report on Speech Research SR-66 (1981)]

The question we ask here is whether children's memory for the order of
occurrence of stimulus items would also vary with their phonetic recodability.
Repeatedly, the literature has suggested that poor readers have difficulty in
retaining the order of items in tests of serial recall (Bakker, 1972; Benton,
19[?]; Corkin, 1974). There are indications, as we noted, that the poor
readers' deficits in item recall may be a manifestation of their deficient
ability to use phonetic codes. We should now ask whether the deficits they
might have in remembering the order of stimuli would also vary with the
phonetic recodability of the items. This is what we would expect in light of
suggestions that one function of phonetic memory codes is to preserve item
order (Baddeley, 1978; Crowder, 1978). Consequently, we would suppose that
the poor reader's difficulty in retaining order information is
material-specific and not a global memory deficit for item order.



To pursue this question experimentally, we needed to discover how poor
readers would fare with order memory for nonlinguistic material. While it is
true that some studies (Corkin, 1974; Noelker & Schumsky, 1973; Stanley,
Kaplan, & Poole, 1975) have reported inferior performance by poor readers in
ordering nonlinguistic stimuli, the interpretation of the findings in each
case is open to some question, either because the items used were readily
labeled or because they were presented for long exposure times. In either
instance, even though the stimuli presented were nonlinguistic, the effect of
the procedure might be to accentuate the differences in performance between
the reader groups by encouraging linguistic recoding on the part of the good
readers, who habitually recode phonetically. Moreover, good and poor readers
have been found to be equivalent in ordering other nonlinguistic items, such
as photographed faces (Holmes & McKeever, 1979). At all events, there has
been no direct test of the hypothesis that the poor readers' problem with
order memory may be linked to a deficiency in the use of phonetic codes. The
present experiment was designed to provide direct evidence for such a link.
By controlling for the ease with which linguistic labels can be given to test
items, we expected to find that differences in the performances of good and
poor readers would depend on the phonetic recodability of the stimulus
material.


The experiment compared good and poor readers' order memory for two
sets of controlled stimuli: a set consisting of items that are easily labeled
(line drawings of common objects), and a set containing items presumed to be
very difficult to label (the nonsense drawings of Kimura, 1963). The latter
were chosen for use in this study because good and poor readers performed
equally well with these stimuli in the test of recognition memory to which we
referred earlier (Liberman et al., Note 1).



In the present procedure, a linear array of five figures is
tachistoscopically presented, after which copies of the five figures are
presented on cards, one figure per card, in random order. Subjects are asked
to rearrange the cards, reconstructing the order in the previous display.
Since poor readers tend not to make full use of phonetic coding in working
memory, we expected them to be less accurate than good readers in ordering the
phonetically recodable pictures of common objects, but not to differ from the
good readers in ordering the nonrecodable, doodle drawings. Thus we expected
an interaction between reading ability and stimulus type, attributable to
differences in the degree of reliance on phonetic recoding.



METHOD 

Subjects

Subjects were selected from four second-grade classes in the Tolland,
Connecticut public school system. Candidates for the poor reader group were
selected for screening if they were so designated by their teachers or if they
scored at the 40th percentile or lower on both word recognition subtests of
the Comprehensive Test of Basic Skills (CTBS) (1974), which had been
administered in the seventh month of the first grade. Candidates for the good
reader group either received a superior evaluation from the teachers or ranked
at or above the 80th percentile on both CTBS subtests.



Subjects selected for screening were administered the Slosson
Intelligence Test (Slosson, 1963) and the word identification and the word
attack subtests of the Woodcock Reading Mastery Tests (Woodcock, 1973) in the
fifth and sixth months of the school year. The final good reader group
consisted of those subjects who attained a combined raw score of at least 115
on the two Woodcock subtests, while the poor reader group included subjects
with a combined score of less than 85. Subjects with extreme IQ scores (below
90 or above 135) were ineligible for further testing. In addition, one poor
reader had to be dropped because of prolonged absence and ensuing scheduling
difficulties. By these criteria, 21 good readers (10 females, 11 males) and
21 poor readers (7 females, 14 males) were selected. The good readers had a
mean age of 95.1 months compared to the poor readers' mean age of 97.2 months,
t(40) = 1.7; p = .10. The good readers had a mean IQ of 115.3 while the poor
readers had a mean IQ of 107.4, t(40) = 2.7; p = .012. The mean combined raw
score on the Woodcock was 134.6 for the good readers (range: 118 to 153) and
53.0 for the poor readers (range: 22 to 77).
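The final selection rule can be restated as a short sketch. The function name
and return values are hypothetical; the cutoffs (combined Woodcock raw score
of at least 115 or below 85, IQ between 90 and 135) are the ones given above.
The sketch covers only the screening scores and does not model the earlier
teacher and CTBS nomination step or the one subject dropped for absence.

```python
def classify_reader(woodcock_combined, iq):
    """Return 'good', 'poor', or None for a screened subject.

    woodcock_combined: combined raw score on the two Woodcock subtests.
    iq: Slosson Intelligence Test score.
    """
    # Subjects with extreme IQ scores were ineligible for further testing.
    if iq < 90 or iq > 135:
        return None
    if woodcock_combined >= 115:
        return "good"
    if woodcock_combined < 85:
        return "poor"
    # Scores from 85 to 114 fall in neither reader group.
    return None
```

Under this rule a subject with a combined score of 100 belongs to neither
group, which keeps the two groups well separated on reading skill.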

Stimuli and Apparatus

Two sets of 50 drawings comprised the stimuli of this study. The first
set consisted of the 50 nonsense drawings of Kimura (1963), which we designate
"phonetically unrecodable" because they are difficult to label distinctively.
The second set, which we call "phonetically recodable," included 50 line
drawings of common objects. The latter had been shown in earlier pilot
studies to be easily recognized by second graders, each drawing typically
eliciting a single response which was a monosyllabic word. Each stimulus
condition required 20 test trials. Each trial consisted of a tachistoscopic
presentation of a different horizontal array of five stimuli mounted on 2 x 2
inch slides. To generate the required 20 arrays for each condition, 10 arrays
were selected by random drawing without replacement from the set of 50 stimuli
for that condition. Then 10 more arrays were generated by a second drawing
for each stimulus condition. One set of three stimuli not used in the test
trials was prepared to be used as practice trials. A sample array for
each stimulus condition is displayed in Figure 1.
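The array-generation procedure can be sketched as follows. This is one
reading of the description, under the assumption that each drawing uses all
50 stimuli exactly once, so that one drawing yields 10 arrays of five. The
function names and the optional seed are hypothetical and serve only to make
the illustration reproducible.

```python
import random


def draw_arrays(stimuli, array_size, rng):
    """One drawing without replacement: shuffle, then cut into arrays."""
    shuffled = list(stimuli)
    rng.shuffle(shuffled)
    return [shuffled[i:i + array_size]
            for i in range(0, len(shuffled), array_size)]


def make_test_arrays(stimuli, n_drawings=2, array_size=5, seed=None):
    """Build the test arrays for one stimulus condition."""
    rng = random.Random(seed)
    arrays = []
    for _ in range(n_drawings):
        # Each drawing starts again from the full set of stimuli.
        arrays.extend(draw_arrays(stimuli, array_size, rng))
    return arrays
```

With 50 stimuli, `make_test_arrays(range(50))` gives 20 arrays of five items.
Every stimulus appears once in the first 10 arrays and once in the last 10,
so each item is presented exactly twice per condition.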



[Figure 1 panels: UNRECODABLE STIMULUS ARRAY (upper); RECODABLE STIMULUS
ARRAY (lower)]

Figure 1. The upper portion of the figure gives a sample stimulus array
consisting of five nonrepresentational line drawings (adapted from
Kimura, 1963) for which ready verbal labels are not available. The
lower portion gives a sample array for the comparison condition in
which the items are easily named common objects (adapted from
Makar, 1969).






